OM nucleic - nucleic search, using sw model
Run on: March 30, 2004, 02:03:16 



Search time 8287 Seconds 

(without alignments) 

12334.771 Million cell updates/sec 



Title:          US-10-092-390-1
Perfect score:  3423
Sequence:       1 atggttatttctttgaactc...gcagcagcagcagtgaatga 3423



Scoring table: IDENTITY_NUC 

Gapop 10.0 , Gapext 1.0 

Searched: 27513289 seqs, 14931090276 residues 

Total number of hits satisfying chosen parameters:      55026578

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
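The header figures can be cross-checked against each other. The sketch below is not taken from the search program itself: it assumes that both strands of every database entry are searched (the `STRANDS = 2` constant is an assumption) and that one "cell update" is one query-residue by database-residue comparison. Under those assumptions the reported hit count and throughput are both reproduced from the other numbers in the header.

```python
# Figures copied from the report header.
QUERY_LENGTH = 3423            # query length (equals the "Perfect score" under IDENTITY_NUC)
DB_SEQUENCES = 27_513_289      # "Searched: ... seqs"
DB_RESIDUES = 14_931_090_276   # "Searched: ... residues"
SEARCH_SECONDS = 8287          # "Search time ... Seconds"
STRANDS = 2                    # ASSUMPTION: forward and reverse-complement both searched


def total_hits(sequences: int, strands: int = STRANDS) -> int:
    """Number of candidate hits if every sequence is scored once per strand."""
    return sequences * strands


def million_cell_updates_per_sec(query_len: int, residues: int,
                                 seconds: float, strands: int = STRANDS) -> float:
    """Dynamic-programming cells filled per second, in millions."""
    cells = query_len * residues * strands
    return cells / seconds / 1e6


if __name__ == "__main__":
    print("hits:", total_hits(DB_SEQUENCES))
    rate = million_cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
    print("million cell updates/sec:", round(rate, 3))
```

The small residual difference from the reported rate is consistent with the search time being printed in whole seconds.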



Database :      EST:*
                 1:  em_estba:*
                 2:  em_esthum:*
                 3:  em_estin:*
                 4:  em_estmu:*
                 5:  em_estov:*
                 6:  em_estpl:*
                 7:  em_estro:*
                 8:  em_htc:*
                 9:  gb_est1:*
                10:  gb_est2:*
                11:  gb_htc:*
                12:  gb_est3:*
                13:  gb_est4:*
                14:  gb_est5:*
                15:  em_estfun:*
                16:  em_estom:*
                17:  em_gss_hum:*
                18:  em_gss_inv:*
                19:  em_gss_pln:*
                20:  em_gss_vrt:*
                21:  em_gss_fun:*
                22:  em_gss_mam:*
                23:  em_gss_mus:*
                24:  em_gss_pro:*
                25:  em_gss_rod:*
                26:  em_gss_phg:*
                27:  em_gss_vrl:*
                28:  gb_gss1:*
                29:  gb_gss2:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES

                    %
Result            Query
   No.    Score   Match  Length  DB  ID         Description
     1   1334.2    39.0    3162  11  BC029999   BC029999 Homo sapi
     2    992.6    29.0    2910  29  AY406554   AY406554 Homo sapi
     3    972.6    28.4    3568  11  AK051642   AK051642 Mus muscu
     4    933.2    27.3    2910  29  AY406556   AY406556 Mus muscu
     5    892.6    26.1    3466  11  AK032661   AK032661 Mus muscu
     6    653.6    19.1    2910  29  AY406555   AY406555 Pan trogl
     7    646.2    18.9    3556  11  AK053551   AK053551 Mus muscu
     8    587.6    17.2     789  14  CD803668   CD803668 UI-M-GV0-
     9    539      15.7     921  13  BU215784   BU215784 603106167
    10    530.4    15.5     755  14  CD802967   CD802967 UI-M-GV0-
    11    508.6    14.9     643  13  BU056532   BU056532 UI-M-FO0-
    12    470      13.7    2944  11  AK048840   AK048840 Mus muscu
    13    456.6    13.3     755  12  BG828819   BG828819 602751395
    14    433      12.6     565  12  BM719978   BM719978 UI-E-EJ0-
c   15    432      12.6     598  12  BM676825   BM676825 UI-E-EJ0-
    16    417.4    12.2     846  13  BX741190   BX741190 BX741190
    17    405.4    11.8     649   9  AL860947   AL860947 AL860947
    18    398.8    11.7     798  13  BU112175   BU112175 603126134
    19    379      11.1     538  10  BB762574   BB762574 BB762574
    20    370      10.8     715   9  AI958909   AI958909 fd05g03.y
    21    367.6    10.7     438  12  BM484594   BM484594 538509 MA
    22    366.8    10.7     779  10  BF529240   BF529240 602041695
    23    359.6    10.5     641  10  BB623063   BB623063 BB623063
    24    359.6    10.5     744  13  BY741291   BY741291 BY741291
    25    354.6    10.4     488  14  H15471     H15471 ym29b02.r1
    26    352.8    10.3     734  14  CF745181   CF745181 UI-M-GV0-
    27    328.6     9.6     772  14  CF538219   CF538219 UI-M-GH0-
    28    328.2     9.6     584  10  BF686873   BF686873 602102830
    29    318.8     9.3     694  13  BY734908   BY734908 BY734908
    30    313.4     9.2  [illegible]  13  BY705363   BY705363 BY705363
    31    309.4     9.0    1123  12  BM563529   BM563529 AGENCOURT
    32    302       8.8     559  10  AW658138   AW658138 93921 MAR
    33    301.2     8.8     563  14  CF538582   CF538582 UI-M-GI0-
    34    296.2     8.7     937  10  BF180097   BF180097 601806458
    35    294.4     8.6     852   9  AL040177   AL040177 3 DKFZp434F
    36    293.2     8.6     401  10  BE664785   BE664785 152639 MA
    37    283.4     8.3     451  14  CB786815   CB786815 AMGNNUC:N
    38    282.2     8.2     463  10  AW445546   AW445546 81787 MAR
    39    271.2     7.9     477  10  BF442243   BF442243 258883 MA
    40    268.8     7.9     693  14  CF739003   CF739003 UI-M-HD0-
    41    260       7.6     501  10  AW478869   AW478869 22097 MAR
    42    255       7.4     634  13  BY726500   BY726500 BY726500
    43    253.6     7.4     634  10  BB650216   BB650216 BB650216
    44    251       7.3     843  13  BU282192   BU282192 603862751
    45    250.2     7.3     600  14  CA893118   CA893118 B0177A01-
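The "% Query Match" column in the summaries appears to be the hit score expressed as a percentage of the perfect score (3423 for this query). That relationship is inferred from the numbers in this report, not from program documentation, so the sketch below should be read as a consistency check; the function name is hypothetical.

```python
PERFECT_SCORE = 3423  # from the report header


def query_match_percent(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Score as a percentage of the perfect score, to one decimal place."""
    return round(100.0 * score / perfect_score, 1)


# (result number, score, % query match as printed in the summaries)
reported = [
    (1, 1334.2, 39.0),
    (2, 992.6, 29.0),
    (3, 972.6, 28.4),
    (45, 250.2, 7.3),
]

if __name__ == "__main__":
    for number, score, printed in reported:
        print(number, score, query_match_percent(score), printed)
```

The same check is useful when a percentage in the table has been split or damaged by extraction, since it can be recomputed from the score alone.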



ALIGNMENTS 



RESULT 1
BC029999
LOCUS       BC029999     3162 bp    mRNA    linear   HTC 06-MAY-2002
DEFINITION  Homo sapiens, clone IMAGE:4156083, mRNA.
ACCESSION   BC029999
VERSION     BC029999.1  GI:20455873
KEYWORDS    HTC.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 3162)
  AUTHORS   Strausberg,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (06-MAY-2002) National Institutes of Health, Mammalian
            Gene Collection (MGC), Cancer Genomics Office, National Cancer
            Institute, 31 Center Drive, Room 11A03, Bethesda, MD 20892-2590,
            USA
  REMARK    NIH-MGC Project URL: http://mgc.nci.nih.gov
COMMENT     Contact: MGC help desk
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: David N. Louis, M.D.
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Baylor College of Medicine Human Genome
            Sequencing Center
            Center code: BCM-HGSC
            Web site: http://www.hgsc.bcm.tmc.edu/cdna/
            Contact: amg@bcm.tmc.edu
            Gunaratne, P.H., Garcia, A.M., Lu, X., Hulyk, S.W., Hale, S.M.,
            Yoon, V.S., Kowis, C.R., Lawrence, S., Martin, R.G., Muzny, D.M.,
            Richards, S., Gibbs, R.A.

            Clone distribution: MGC clone distribution information can be found
            through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov
            Series: IRAK Plate: 52 Row: a Column: 10
            This clone was selected for full length sequencing because it
            passed the following selection criteria: matched mRNA gi: 14192940
            This clone has the following problem: no 5' EST match.
FEATURES             Location/Qualifiers
     source          1..3162
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:4156083"
                     /tissue_type="Brain, anaplastic oligodendroglioma with
                     1p/19q loss"
                     /clone_lib="NCI_CGAP_Brn67"
                     /lab_host="DH10B"
                     /note="Vector: pCMV-SPORT6"
ORIGIN



Query Match 39.0%; Score 1334.2; DB 11; Length 3162; 

Best Local Similarity 69.4%; Pred. No. 0; 

Matches 1845; Conservative 0; Mismatches 808; Indels 6; Gaps 
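The "Best Local Similarity" figure is consistent with the number of matches divided by the total number of alignment columns (matches, conservative substitutions, mismatches and indels). This reading is inferred from the statistics printed for Results 1 and 2 in this report; the function below is a hypothetical helper for checking it, not the program's own definition.

```python
def best_local_similarity(matches: int, mismatches: int, indels: int,
                          conservative: int = 0) -> float:
    """Percentage of alignment columns that are exact matches (one decimal)."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # Result 1 (BC029999): Matches 1845; Conservative 0; Mismatches 808; Indels 6
    print(best_local_similarity(1845, 808, 6))
    # Result 2 (AY406554): Matches 1396; Conservative 0; Mismatches 1013; Indels 6
    print(best_local_similarity(1396, 1013, 6))
```

For Result 1 the column count also agrees with the alignment coordinates: the query spans 74 to 2729 (2656 bases), and 2656 plus the 3 columns where only the database sequence has a base gives 2659 = 1845 + 808 + 6.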



Qy     74 CTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACTCAGTGACTGTGC 133
          | ||||| |  || ||||| || ||||| ||||||||||| |||||  | |||||||| |
Db    229 CCCTGAACCCCGAGGACCCCAACGTGTGCAGCCACTGGGAGAGCTATGCTGTGACTGTCC 288

Qy    134 AAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGCACTGACATTCTAA 193
          | || || ||  |||| ||||| ||||| || || |||||  | ||||| ||||| || |
Db    289 AGGAATCGTATGCACACCCCTTCGATCAGATCTATTACACACGATGCACAGACATCCTCA 348

Qy    194 ACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTATCGACATGGGGAGA 253
          ||||||| || |||||  ||||| |  |||| |||  ||| || |||||    ||
Db    349 ACTGGTTCAAGTGCACCAGGCACCGGATCAGTTATAAGACGGCGTATCGGAGAGGCCTCC 408

Qy    254 AGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAAAGCGGGGAAATGT 313
           ||| |||||  |||| | ||| ||||| || ||||| |  ||||| ||||| ||  | |
Db    409 GGACCATGTACCGGCGGAGGTCCCAGTGCTGCCCTGGCTACTATGAGAGCGGAGACTTCT 468

Qy    314 GTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCAAACACCTGTC 373
          |  | ||||  ||| | ||  | ||||| || || |||||  || | ||  ||||||| |
Db    469 GCATACCCCTGTGTACGGAGGAGTGTGTGCACGGCCGCTGCGTTTCCCCGGACACCTGCC 528

Qy    374 AGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGTGATCACTGGG 433
          | || |||||||||||||||||| || |||||||||| | ||||||  | || |||||||
Db    529 ACTGCGAGCCTGGCTGGGGAGGGCCCGACTGCTCCAGCGGCTGCGACAGCGACCACTGGG 588

Qy    434 GTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAACCCCATCACCG 493
          | |||||||||| || ||||||||||||| | || || || ||||| ||||||||||| |
Db    589 GGCCCCACTGCAGCAACCGGTGCCAGTGCCAGAACGGCGCCCTGTGTAACCCCATCACAG 648

Qy    494 GGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGCAGG 553
          | || |||   || || || |||||||| || |||||||||||||| | ||| | ||  |
Db    649 GCGCCTGCGTGTGCGCCGCCGGCTTCCGTGGATGGCGCTGCGAGGAGCTCTGCGCGCCTG 708

Qy    554 GCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCGACC 613
          ||||| | || || |  || || | |   ||||||||||   | || |||| ||||||||
Db    709 GCACCCACGGCAAGGGATGCCAGCTGCCGTGCCAGTGCCGACACGGTGCCAGCTGCGACC 768

Qy    614 ACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATCTTT 673
           |  | | || || |||| |||| |||| || |||||||| | || ||| ||||| || |
Db    769 CCCGCGCCGGCGAGTGCCTCTGCGCACCTGGCTACACCGGCGTCTACTGCGAGGAGCTGT 828

Qy    674 GTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAGGAG 733
          | |||||||| |  |||||   | || ||||||| | | ||||| ||||| ||||| ||
Db    829 GCCCTCCTGGGAGCCATGGAGCTCACTGTGAGCTGCGCTGCCCCTGTCAGAATGGGGGCA 888

Qy    734 TGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGTGTG 793
            || || ||| ||||||| || ||  | |||||  | ||||||| |||  |||||||||
Db    889 CCTGCCACCACATCACTGGCGAGTGTGCCTGCCCCCCAGGCTGGACGGGAGCAGTGTGTG 948

Qy    794 GTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCCATA 853
            ||||| |||||    ||    |||||  |||||||   ||| || || |  |||||
Db    949 CCCAGCCCTGCCCACCAGGGACATTTGGCCAGAACTGCAGCCAGGATTGTCCTTGCCACC 1008

Qy    854 ATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAGGGG 913
          ||||||||  ||||||    |  || || || || || || |   | |||||||  ||||
Db   1009 ATGGAGGGCAGTGTGACCACGTGACTGGACAGTGCCACTGTACAGCTGGATACATGGGGG 1068

914 AAC G GT GC CAGGAT GAGT GT C CT GT T GGGAC CT AT GG C GTTCTCTGTGCT GAGAC CT G C C 973 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I IN 

1069 ACAGGTGCCAAGAGGAGTGCCCCTTCGGGTCCTTCGGCTTCCAGTGCTCACAGCGCTGTG 112 8 

974 AGT GT GT C AAC G GAG G GAAGT GT T AC CAC GT GAG C GGC G C AT GC CT CT GT GAAGC AGGCT 1033 

III I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 
1129 ACTGCCACAATGGGGGGCAGTGTTCACCCACCACGGGTGCCTGCGAGTGTGAGCCTGGCT 1188 

1034 TTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAATGTG 1093 

IN I I I I I I I I I I I I I I I I I I I I I I II I I I I II 

1189 ACAAGGGCCCACGCTGCCAGGAGCGACTGTGCCCGGAGGGCCTGCATGGCCCAGGCTGCA 1248 

1094 ACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAGAGT 1153 

I i I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

1249 CCCTGCCCTGCCCCTGTGACGCTGACAACACCATCAGCTGCCACCCAGTAACTGGAGCTT 1308 

1154 GTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGATTCT 1213 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III II I I I I I I 

1309 GTACCTGCCAGCCAGGCTGGTCTGGTCACCACTGCAATGAATCCTGCCCTGTTGGCTACT 1368 

1214 ACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGA 1273 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I II II 

1369 ATGGCGATGGCTGCCAGCTGCCTTGCACCTGTCAGAATGGCGCCGACTGCCACAGCATCA 1428 

1274 CT GGAAAGT GC AC CT GT GC C C CAGGAT T CAAAG GAAT T GACT G CT CT AC C C CAT G C C CT C 1333 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I IN I 

1429 CTGGGGGCTGCACTTGTGCTCCGGGCTTCATGGGAGAGGTCTGTGCCGTTTCCTGTGCAG 1488 

1334 TGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCT 1393 

I I I I I I I I I I III I I I I II I I I I I I I M I I 

14 8 9 CAGGGACCTATGGCCCCAACTGCTCGTCCATCTGTAGCTGTAACAATGGTGGCACCTGCT 1548 

1394 CTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCA 14 53 

I II II II II I I I I I I I I I I I I I I III I I I I I II I I I I I I I I II I 

154 9 CCCCAGTAGATGGCTCCTGTACCTGCAAGGAAGGGTGGCAGGGCCTGGACTGCACCCTGC 1608 

1454 GATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGG 1513 

I I I I I I I I I I I I I I I I I I I I I I II I I I I III M I I I I I I I 

1609 CATGTCCCAGTGGGACGTGGGGCCTGAACTGCAACGAGAGCTGCACCTGTGCCAATGGGG 1668 

1514 GAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAAT 1573 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I M I I I I I I I I 

1669 CAGCCTGCAGCCCCATAGACGGCTCCTGCTCCTGCACTCCTGGCTGGCTGGGAGACACCT 1728 

1574 GCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCA 1633 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 

1729 GTGAGCTGCCTTGCCCGGATGGCACATTTGGGCTGAACTGCAGTGAACACTGTGACTGCA 1788 

1634 GCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAG 1693 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 
1789 GCCATGCTGATGGATGTGACCCCGTCACAGGCCACTGCTGCTGCCTGGCCGGATGGACAG 1848 

1694 GTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCT 1753 
I | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml 
1849


Qy 


1754 


Db 


1909 


Qy 


1814 


Db 


1969 


Qy 


1874 


Db 


2029 


Qy 


1934 


Db 


2089 


Qy 


1994 


Db 


2149 


Qy 


2054 


Db 


2209 


Qy 


2114 


Db 


2269 


Qy 


2174 


Db 


2329 


Qy 


2234 


Db 


2389 


Qy 


2294 


Db 


2449 


Qy 


2354 


Db 


2509 


Qy 


2414 


Db 


2569 


Qy 


2474 


Db 


2629 


Qy 


2534 


Db 


2686 



GCATCCGCTGTGACAGCACGTGTCCACCTGGCCGCTGGGGCCCCAACTGCTCTGTCTCCT 1908 

GCTACT GTAAAAAT GGGGCTT CAT GCT CC C CT GAT GAT GGCAT C T GC GAGT GT GCAC C AG 1813 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GCAGCTGTGAGAATGGAGGCTCCTGCTCCCCAGAGGATGGGAGCTGCGAGTGTGCCCCTG 1968 

GCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCA 1873 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 

GCTTCCGAGGACCCTTATGCCAGAGAATCTGCCCCCCTGGGTTCTATGGCCACGGCTGCG 2028 

GC CAGAC AT GC C CACAGT GC GT T C ACAGCAGC GGGC C CT GC C AC C AC AT C AC CGGCCTGT 1933 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 

CCCAGCCATGCCCCCTCTGCGTGCACAGCAGCAGGCCCTGCCACCACATCAGCGGCATCT 2088 

GTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGAT 1993 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I 

GTGAGTGCCTCCCAGGATTCTCTGGAGCTCTCTGCAACCAAGTGTGTGCTGGAGGATACT 2148 

T T GG GAAAAACT GT G C AGGAAT T T GT AC CT GCAC CAAC AAC GGAAC C T GT AAC C C CAT T G 2053 

I I I I I I I III I I I I I I I I M I I II 

TTGGGCAGGACTGTGCCCAGCTCTGCTCCTGTGCCAACAACGGGACCTGCAGCCCTATCG 2208 

ACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCAC 2113 

| | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

ATGGCTCCTGCCAGTGCTTTCCTGGATGGATTGGCAAGGACTGCTCACAGGCTTGCCCAC 2268 

CTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCA 2173 

II MINIMI I I I I I I I I I I I I I I I I I I I I I I I I I I Mill 
CCGGGTTCTGGGGCCCCGCCTGCTTCCACGCATGCAGCTGCCACAACGGGGCGAGCTGCA 2328 



GCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGA 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GCGCCGAGGACGGGGCCTGCCACTGCACCCCTGGCTGGACTGGACTCTTCTGCACACAGC 

GAT GT C CT CTAG GGT T T TAT GGAAAAGAT T GT G CACT GAT AT GC CAAT GT C AAAAC GG AG 

I I I I I II I I I I I I I I I I I I I I I I M I I I I I I I I I 

GCTGCCCAGCAGCATTTTTTGGGAAGGACTGTGGGCGCGTATGCCAGTGTCAGAATGGCG 



2233 



2388 



2293 



2448 



2353 



CTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACT 

I II I I I I I I I I I I I I I I I I I I 

CCAGCTGT GACCACAT CAGTGGCAAGT GCACCT GCCGCACAGGCTT CAC C GGGCAACACT 2508 

GT GAGCAGAAGT GC C CTT CAGGAACATAT GGCT AT GGCT GT CGCCAGAT AT GT GATT GT C 2413 

MINIMI M I I II I I II I I I I I I I I M II I I II I M I I I I I I I 

GT GAG CAGAGAT GT GC CC C AGGAAC CT TT GGCT AT GG GT GTC AGC AGCT AT GT GAGT GC A 2568 

TGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGG 24 73 

II II II I II I II II I II M I I I M II I M I 

T GAACAACT CC AC CT GT GACCAT GT CACC G GCACCT GTT ACT G CAGC C C T GGCTT CAAAG 2 628 

GAGC GAGAT GT GAT CAAGCT GGT GTT AT CAT AGTT GGAAAT CT GAACAGCTT AAGCCGAA 2533 

II I I I II I I I I I I I M I I I I I I I I I I I I I II M I 



C CAGT ACT GCT CT CC CT GCT GAT T C CT AC CAGAT C G GGG CCAT T GCAG GCAT CAT CAT T C 2593 

Ml I II I I I II II II I I M I I I I I II I I I I M II 

TCAGCCCAGCACTGGGTGCAGAGCGGCACTCGGTGGGTGCTGTCACAGGCATCATGCTCC 2745 



Qy 2594 T T GT C CTAGTT GT T CT CTT C CT ACT GGCAT T GT T CAT T ATTT ATAGACACAAGC AGAAG G 2653 

I I I I III I I I I I I I III II I I I I I I I I I 

Db 2746 TGTTATTCCTCATTGTGGTGCTGCTGGGCCTATTTGCCTGGCATCGGCGGCGGCAGAAAG 28 05 

Qy 2654 GAAAG GAAT C AAGC AT G C C AG C AGT T AC CT AC AC C C C T G C TAT GAG GGT C GT C AAT G 2710 

I I I I I III II II I I I I I I I I I I I I I I I I I I I II 

Db 2806 AGAAG GGC C GAGAC CT GGCT CC C CGT GT CT CCT AC ACAC CT GCCAT GAG GAT GAC CAGCA 2865 

Qy 2711 CAGATTATACCATTTCAGG 2729 

I I I I I I I I I I I I I 
Db 2866 CCGACTACTCCCTCTCAGG 2884 
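Because the search used the IDENTITY_NUC table, the marker line between each Qy/Db pair can be regenerated from the two sequences when the printed bars are unreadable. The sketch below is an illustration written for this report, not part of the search program: it assumes a bar is printed wherever the two aligned residues are identical and a blank otherwise, and that gap positions, if present, are written as `-` in both input strings. The function name `match_line` is hypothetical.

```python
def match_line(query: str, subject: str, gap: str = "-") -> str:
    """Return a '|' for each identical aligned residue, a space otherwise."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "|" if q == s and q != gap else " "
        for q, s in zip(query.upper(), subject.upper())
    )


if __name__ == "__main__":
    # Final block of the Result 1 alignment.
    qy = "CAGATTATACCATTTCAGG"   # Qy 2711..2729
    db = "CCGACTACTCCCTCTCAGG"   # Db 2866..2884
    bars = match_line(qy, db)
    print("Qy 2711", qy, "2729")
    print("       ", bars)
    print("Db 2866", db, "2884")
    print("identities:", bars.count("|"))
```

For this block the regenerated line contains 13 bars, the same number of marks that survive in the printed marker line.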



RESULT 2
AY406554
LOCUS       AY406554     2910 bp    DNA     linear   GSS 12-DEC-2003
DEFINITION  Homo sapiens HCM2596 gene, VIRTUAL TRANSCRIPT, partial sequence,
            genomic survey sequence.
ACCESSION   AY406554
VERSION     AY406554.1  GI:39762525
KEYWORDS    GSS.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 2910)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Inferring nonneutral evolution from human-chimp-mouse orthologous
            gene trios
  JOURNAL   Science 302 (5652), 1960-1963 (2003)
   PUBMED   14671302
REFERENCE   2  (bases 1 to 2910)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-NOV-2003) Celera Genomics, 45 West Gude Drive,
            Rockville, MD 20850, USA
COMMENT     This sequence was made by sequencing genomic exons and ordering
            them based on alignment.
FEATURES             Location/Qualifiers
     source          1..2910
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
     gene            <1..>2910
                     /locus_tag="HCM2596"
ORIGIN

  Query Match             29.0%;  Score 992.6;  DB 29;  Length 2910;
  Best Local Similarity   57.8%;  Pred. No. 2.3e-272;
  Matches 1396;  Conservative    0;  Mismatches 1013;  Indels    6;  Gaps    2;



Qy 318 CCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCAAACACCTGTCAGTG 377 

I I I I I I I I I I I I II I I I I I I I I I I I II III I I I I I I I I I I I 

Db 75 CGCCCTGTGTACGGAGGAGTGTGTGCACGGCCGCTGCGTTTCCCCGGACACCTGCCACTG 134 . 

Qy 37 8 TGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGTGATCACTGGGGTCC 4 37 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 135 CGAGCCTGGCTGGGGAGGGCCCGACTGCTCCAGCGNNNNNNNNNNNNNNNNNNNNNNNNN 194 

Qy 43 8 CCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAACCCCATCACCGGGGC 4 97 

Db 195 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 2 54 

Qy 4 98 TTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGCAGGGCAC 557 

Db 255 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 314 

Qy 558 CT AT GGTAACGACTGT CAT CAGAGAT GCCAGT GCCAGAAT GGAGC CAC CT GCGACCACGT 617 

Db 315 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 374 

Qy 618 CACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATCTTTGTCC 677 

Ml I I I I I II II II 

Db 375 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCTGCGAGGAGCTGTGCCC 4 34 

Qy 678 T CCT GGT AAACAT GGT CCACAGT GT GAGCAGAGAT GCCCTT GT CAAAATGGAGGAGT GT G 7 37 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I II 

Db 435 TCCTGGGAGCCATGGAGCTCACTGTGAGCTGCGCTGCCCCTGTCAGAATGGGGGCACCTG 494 

Qy 738 TCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGTGTGGTCA 7 97 

I I I I I I I I I I I i I I I I I II I I I I I I I I I I I I 

Db 4 95 CCACCACATCACTGGCGAGTGTGCCTGCCCCCCAGGCTGGACGNNNNNNNNNNNNNNNNN 554 

Qy 7 98 GCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCCATAATGG 857 

Db 555 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 614 

Qy 858 AGGGACGT GT GAT GCT GC CAC AGGC CAAT GT C ATT GCAGT C CAGGAT AC AC AGG GGAAC G 917 

Db 615 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 674 

Qy 918 GTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTG 977 

I I I I II I I I I I I I I I I I I I I I I Mill II I M III I II 

Db 675 GTGCCAAGAGGAGTGCCCCTTCGGGTCCTTCGGCTTCCAGTGCTCACAGCGCTGTGACTG 734 

Qy 97 8 TGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTGC 1037 

II I I I II I II I I I I II I M I I I I I I I I I I I I I I I 

Db 735 CCACAATGGGGGGCAGTGTTCACCCACCACGGGTGCCTGCGAGTGTGAGCCTGGCTACAA 7 94 

Qy 1038 TGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACAA 1097 

III I II II I I I II II I I I I I I II II I I I I I I II I 

Db 7 95 GGGCCCACGCTGCCAGGAGCGACTGTGCCCGGAGGGCCTGCATGGCCCAGGCTGCACCCT 854 

Qy 1098 AC G GT GT C C CT GC C ACTT GGAAAAC ACT C AT AGCT GT CAC C C CAT GT CT G GAGAGT GT G C 1157 

I I I I I I I I II M I M I I I I II I I II II I MIMI I I I I 

Db 855 GCCCTGCCCCTGTGACGCTGACAACACCATCAGCTGCCACCCAGTAACTGGAGCTTGTAC 914 



Qy 

Db 

Qy 

Db 

Qy 

Db 
Qy 

Db 

Qy 

Db 
Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 
Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 



1158 C T GCAAGCC GGGC T GGT CAG GAC T CT ACT GT AAT GAGACAT GTT CT C CT GGAT T CT ACGG 1217 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I II I I I I I I I 

915 CTGCCAGCCAGGCTGGTCTGGTCACCACTGCAATGAATCCTGCCCTGTTGGCTACTATGG 974 

1218 GGAAGCT T GC CAGC AGAT CT GCAGCT GC CAAAAT GGGGC AGACT GTGACAGT GT GACT G G 1277 

III I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 

975 CGATGGCTGCCAGCTGCCTTGCACCTGTCAGAATGGCGCCGACTGCCACAGCATCACTGG 1034 

1278 AAAGT GCAC CT GT G CC C CAG GAT T CAAAGGAAT T GACT GCT CT ACC C CAT G CC CT CT GGG 1337 

I I I I I I II I I I I I I I I I I III I I I I I III I M 

1035 GGGCTGCACTTGTGCTCCGGGCTTCATGGGAGAGGTCTGTGCCGTTTCCTGTGCAGCAGG 1094 

1338 AACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCTCC 13 97 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 

1095 GACCTATGGCCCCAACTGCTCGTCCATCTGTAGCTGTAACAATGGTGGCACCTGCTCCCC 1154 

1398 TGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGATG 1457 

I I I I II I I I I I I I I I II I I I I I I I I I I I M I I I I I I I I I I I Ml 
1155 AGTAGATGGCTCCTGTACCTGCAAGGAAGGGTGGCAGGGCCTGGACTGCACCCTGCCATG 1214 

1458 TCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGAGC 1517 

I I I I I I I I I I I I I I I I I I I I I I I I I III M I I I I I I I I I I 

1215 TCCCAGTGGGACGTGGGGCCTGAACTGCAACGAGAGCTGCACCTGTGCCAATGGGGCAGC 1274 

1518 CTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGCGA 1577 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1275 CTGCAGCCCCATAGACGGCTCCTGCTCCTGCACTCCTGGCTGGCTGGGAGACACCTGTGA 1334 

1578 ACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGCCA 1637 

I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1335 GCTGCCTTGCCCGGATGGCACATTTGGGCTGAACTGCAGTGAACACTGTGACTGCAGCCA 1394 

1638 CGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGTGT 1697 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1395 TGCTGATGGATGTGACCCCGTCACAGGCCACTGCTGCTGCCTGGCCGGATGGACAGGCAT 1454 

1698 CCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGCTA 1757 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 

1455 CCGCTGTGACAGCACGTGTCCACCTGGCCGCTGGGGCCCCAACTGCTCTGTCTCCTGCAG 1514 

1758 CTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGCTT 1817 

I M I I I I I I I I II I I I M I I I II Mill I I I I I I I I I I I I I II I II I I 

1515 CTGTGAGAATGGAGGCTCCTGCTCCCCAGAGGATGGGAGCTGCGAGTGTGCCCCTGGCTT 1574 

1818 C C GAGGC AC CACT T GT CAGAGGAT CT GCT C C C CT GGT TTT T AT GGG CAT C GCT GCAGC C A 1877 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
1575 CCGAGGACCCTTATGCCAGAGAATCTGCCCCCCTGGGTTCTATGGCCACGGCTGCGCCCA 1634 

1878 GACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGTGA 1937 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II II I I I I I I II I I • 

1635 GCCATGCCCCCTCTGCGTGCACAGCAGCAGGCCCTGCCACCACATCAGCGGCATCTGTGA 1694 

1938 CTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTTGG 1997 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I II I I 

1695 GTGCCTCCCAGGATTCTCTGGAGCTCTCTGCAACCAAGTGTGTGCTGGAGGATACTTTGG 1754 

1998 GAAAAACT GTGCAGGAATTT GTACCT GCACCAACAACGGAACCTGTAAC CCCATTGACAG 2057 



Db 


1755 


Qy 


2058 


Db 


1815 


Qy 


2118 


Db 


1875 


Qy 


2178 


Db 


1935 


Qy 


2238 


Db 


1995 


Qy 


2298 


Db 


2055 


Qy 


2358 


Db 


2115 


Qy 


2418 


Db 


2175 


Qy 


2478 


Db 


2235 


Qy 


2538 


Db 


2292 


Qy 


2598 


Db 


2352 


Qy 


2658 


Db 


2412 


Qy 


2715 


Db 


2472 



M I I I I I I I III MM I I I I I I I I I I I I II I I II I M II I 

GCAGGACTGTGCCCAGCTCTGCTCCTGTGCCAACAACGGGACCTGCAGCCCTATCGATGG 1814 

ATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCTGC 2117 

I I I I I I I I I I I I I I I II II II I M I I I I I II I I I I I I I M II I 

CTCCTGCCAGTGCTTTCCTGGATGGATTGGCAAGGACTGCTCACAGGCTTGCCCACCCGG 1874 

CCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGCGC 2177 

I I I M II II M I I M M I I M I I I M I I I II II II I I II I I M I 

GTTCTGGGGCCCCGCCTGCTTCCACGCATGCAGCTGCCACAACGGGGCGAGCTGCAGCGC 1934 

CTACGATGGGGAAT GTAAAT GCACT CCTGGCTGGACAGGGCTCTACTGCACT CAGAGATG 2237 

I I I I I I I I II I I I I I I I I I II II II II I I I I I I II I II I M I I I I 

CGAGGACGGGGCCTGCCACTGCACCCCTGGCTGGACTGGACTCTTCTGCACACAGCGCTG 1994 

TCCTCTAGGGTTTTATGGAA7^AGATTGTGCACTGATATGCCAATGTCAAAACGGAGCTGA 2297 

II || I I I I I I I I I I I I I II I I I I II I I I I I I I II II M 

CCCAGCAGCATTTTTTGGGAAGGACTGTGGGCGCGTATGCCAGTGTCAGAATGGCGCCAG 2054 

CTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGTGA 2357 

II I I II II I II III I I I I I I I I II I I M II II II III I I II I M I 

CTGTGACCACATCAGTGGCAAGTGCACCTGCCGCACAGGCTTCACCGGGCAACACTGTGA 2114 

GC AGAAGT GC CCT T CAGGAAC AT AT GG CT AT GGC T GT C GC C AGAT AT GT GATT GT CT G AA 2 417 

Mill II I I I I I I I I I II II I II II I II I I I I II I I II I I I I I II 

GCAGAGAT GT GC C C CAGGAAC CTT T GGCTAT GG GT GT CAGCAG CTAT GT GAGT GCAT GAA 2174 

CAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGAGC 2477 

I I M I I II I I I I I II I I II II M I II II I I M M II II II II I M ill 

CAACTCCACCTGTGACCATGTCACCGGCACCTGTTACTGCAGCCCTGGCTTCAAAGGAAT 2234 

GAGAT GT GAT C AAGCT GGT GT TAT C AT AGT T G GAAAT CT G AAC AGC T T AAG C C GAAC C AG 2537 
II I I I I I I I I I I I I I I I I I I I I II I M M II I I M 



TACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTTGT 2597 

I I I I I | | I I I II I I I I I I I II I II II I II Ml I 

CCCAGCACTGGGTGCAGAGCGGCACTCGGTGGGTGCTGTCACAGGCATCATGCTCCTGTT 2351 

CCT AGTT GT T CTCTT C CT ACT GGCAT T GTT CAT TAT T T AT AGACACAAGCAGAAGGGAAA 2657 

I I I I I II I I I I II 

ATTCCTCATTGTGGTGCTGCTGGGCCTATTTGCCTGGCATCGGCGGCGGCAGAAAGAGAA 2411 

GGAATCAAGCATG C C AGC AGT T AC CTAC AC C C CT GCT AT GAG G GT C GT CAAT GC AGA 2714 

II I I I M I I I 

GGGC C GAGAC CT GGCT CCC C GT GT CT C CT AC AC ACCT GC C AT GAGGAT GAC CAGCAC CGA 2471 

T TAT AC CAT T T C AGG 272 9 

II I I I II II I 

CTACTCCCTCTCAGG 24 8 6 



RESULT 3 
AK051642 

LOCUS AK051642 3568 bp mRNA linear HTC 20-SEP-2003 

DEFINITION Mus musculus 12 days embryo spinal ganglion cDNA, RIKEN full-length 
enriched library, clone : D130061K05 product :MEGF11 PROTEIN 



ACCESSION 
VERSION 
KEYWORDS 
SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 
TITLE 
JOURNAL 
MEDLINE 
PUBMED 

REFERENCE 
AUTHORS 

TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 



TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 

TITLE 
JOURNAL 
REFERENCE 
AUTHORS 

TITLE 

JOURNAL 
REFERENCE 
AUTHORS 



Carninci, P . 

Itoh,M. , 
Harada,A. , 



(KIAA1781) homolog [Homo sapiens], full insert sequence. 
AK051642 

AK051642.1 GI:26342079 
HTC; CAP trapper. 
Mus musculus (house mouse) 
Mus musculus 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 
Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Mus. 
1 

Carninci, P. and Hayashizaki, Y. 

High-efficiency full-length cDNA cloning 

Meth. Enzymol. 303 f 19-44 (1999) * 

99279253 

10349636 

2 

Carninci, P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K., 

Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki , Y . 

Normalization and subtraction of cap-trapper-selected cDNAs to 

prepare full-length cDNA libraries for rapid discovery of new gene 

Genome Res. 10 (10), 1617-1630 (2000) 

20499374 

11042159 

3 

Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki, N., 
Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H. 
Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T. 
Yamamoto,R., Matsumoto, H. , Sakaguchi, S . , Ikegami,T., Kashiwagi, K. , 
Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M., 
Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J. 
Okazaki,Y., Muramatsu, M. , Inoue,Y., Kira,A. and Hayashizaki , Y . 
RIKEN integrated sequence analysis (RISA) system — 384-format 
sequencing pipeline with 384 multicapillary sequencer 
Genome Res. 10 (11), 1757-1771 (2000) 
20530913 
11076861 
4 

The RIKEN Genome Exploration Research Group Phase II Team and the 
FANTOM Consortium. 

Functional annotation of a full-length mouse cDNA collection 

Nature 409, 685-690 (2001) 

5 

The FANTOM Consortium and the RIKEN Genome Exploration Research 
Group Phase I & II Team. 

Analysis of the mouse transcriptome based on functional annotation 
of 60,770 full-length cDNAs 
Nature 420, 563-573 (2002) 
6 (bases 1 to 3568) 

Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Bono,H., Carninci, P. 
Fukuda,S., Furuno,M., Hanagaki,T., Hara,A., Hashizume, W. , 
Hayashida, K. , Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T., 
Hori,F., Imotani,K., Ishii,Y., Itoh,M., Kagawa,!., Kasukawa,T., 
Katoh,H., Kawai,J., Kojima,Y., Kondo,S., Konno,H., Kouda,M., 
Koya,S., Kurihara,C, Matsuyama, T . , Miyazaki,A., Murata,M. , 
Nakamura,M., Nishi,K., Nomura, K. , Numazaki,R., Ohno,M., Ohsato,N., 
Okazaki,Y., Saito,R., Saitoh, H., Sakai,C, Sakai,K., Sakazume,N., 
Sano,H., Sasaki, D., Shibata,K., Shinagawa, A. , Shiraki,T., 
Sogabe,Y., Tagami,M., Tagawa,A. , Takahashi , F. , Takaku-Akahira, S . , 



Takeda,Y., Tanaka,T., Tomaru,A., Toya,T., Yasunishi,A. , 
Muramatsu,M. and Hayashizaki, Y. 
TITLE Direct Submission 

JOURNAL Submitted (16-JUL-2001) Yoshihide Hayashizaki, The Institute of
Physical and Chemical Research (RIKEN), Laboratory for Genome
Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
Fax:81-45-503-9216)
COMMENT cDNA library was prepared and sequenced in Mouse Genome 

Encyclopedia Project of Genome Exploration Research Group in Riken 
Genomic Sciences Center and Genome Science Laboratory in RIKEN. 
Division of Experimental Animal Research in Riken contributed to 
prepare mouse tissues. 

Please visit our web site for further details. 
URL:http://genome.gsc.riken.go.jp/
URL:http://fantom.gsc.riken.go.jp/.
FEATURES Location/Qualifiers 
source 1..3568
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL/6J"
/db_xref="FANTOM_DB:D130061K05"
/db_xref="MGI:2419956"
/db_xref="taxon:10090"
/clone="D130061K05"
/tissue_type="spinal ganglion"
/clone_lib="RIKEN full-length enriched mouse cDNA library"
/dev_stage="12 days embryo"
CDS 147..2990
/note="unnamed protein product; MEGF11 PROTEIN (KIAA1781)
homolog [Homo sapiens] (SPTR|Q96KG6, evidence: FASTY,
85.9%ID, 85.4%length, match=2363)
putative"
/codon_start=1
/protein_id="BAC34702.1"
/db_xref="GI:26342080"

/translation="MAPSAVGLLVFLLQAALALNPEDPNVCSHWESYAVTVQESYAHP 
FDQIYYTRCADILNWFKCTRHRISYKTAYRRGLRTMYRRRSQCCPGYYENGDFCIRCD 
SEHWGPHCSNRCQCQNGALCNPITGACVCAPGFRGWRCEELCAPGTHGKGCQLLCQCH
HGASCDPRTGECLCAPGYTGVYCEELCPPGSHGAHCELRCPCQNGGTCHHITGECACP 
PGWTGAVCAQPCPPGTFGQNCSQDCPCHHGGQCDHVTGQCHCTAGYMGDRCQEECPFG 
TFGFLCSQRCDCHNGGQCSPATGACECEPGYKGPSCQERLCPEGLHGPGCTLPCPCDT 
ENTISCHPVTGACTCQPGWSGHYCNESCPAGYYGNGCQLPCTCQNGADCHSITGSCTC 
APGFMGEVCAVPCAAGTYGPNCSSVCSCSNGGTCSPVDGSCTCREGWQGLDCSLPCPS 
GTWGLNCNETCICANGAACSPFDGSCACTPGWLGDSCELPCPDGTFGLNCSEHCDCSH 
ADGCDPVTGHCCCLAGWTGIRCDSTCPPGRWGPNCSVSCSCENGGSCSPEDGSCECAP 
GFRGPLCQRICPPGFYGHGCAQPCPLCVHSRGPCHHISGICECLPGFSGALCNQVCAG 
GHFGQDCAQLCSCANNGTCSPIDGSCQCFPGWIGKDCSQGCPSAFFGKDCGHICQCQN 
GASCDHITGKCTCRTGFSGRHCEQRCAPGTFGYGCQQLCECMNNATCDHVTGTCYCSP 
GFKGIRCDQAALMMDELNPYTKISPALGAERHSVGAVTGIVLLLFLVWLLGLFAWRR 
RRQKEKGRDLAPRVSYTPAMRMTSTDYSLSDLSQSSSHAQCFSNASYHTLACGGPATS 
QASTLDRNSPTKLSNKSLDRDTAGWTPYSYVNVLDSHFQISALEARYPPEDFYIELRH 
LSRHAEPHSPGTCGMDRRQNTYIMDKGFKVAPA" 

ORIGIN 



Query Match 28.4%; Score 972.6; DB 11; Length 3568; 

Best Local Similarity 65.4%; Pred. No. 1.4e-266; 

Matches 1495; Conservative 0; Mismatches 699; Indels 93; Gaps 1; 

Qy 74 CTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACTCAGTGACTGTGC 133

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ! I I I I I I II I I I I I I I I I I I I 

Db 199 CTCTGAACCCTGAAGACCCCAATGTGTGTAGCCACTGGGAGAGCTATGCCGTGACTGTGC 258

Qy 134 AAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGCACTGACATTCTAA 193

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I III I I I I I I I I I 
Db 259 AGGAGTCTTATGCACACCCCTTTGATCAGATCTACTACACACGATGTGCAGACATCCTCA 318

Qy 194 ACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTATCGACATGGGGAGA 253 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 319 ACTGGTTCAAGTGTACCCGGCACCGGATCAGCTATAAGACCGCGTATAGGCGCGGCCTCC 378 

Qy 254 AGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAAAGCGGGGAAATGT 313 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 379 GGACCATGTACCGGCGGAGGTCCCAATGCTGCCCTGGCTACTATGAGAACGGAG 432

Qy 314 GTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCAAACACCTGTC 373 

Db 433 432 

Qy 374 AGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGTGATCACTGGG 433

I I I I I I I I I I I I I I I I I I I I I I I 
Db 433 ACTTCTGCATTCGCTGTGACAGCGAGCACTGGG 465 

Qy 434 GTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAACCCCATCACCG 493

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 466 GTCCCCACTGCAGCAACCGGTGTCAGTGTCAGAACGGCGCCCTGTGCAACCCTATCACCG 525

Qy 494 GGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGCAGG 553

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 526 GCGCCTGCGTGTGCGCCCCGGGCTTCCGAGGCTGGCGCTGTGAGGAACTCTGCGCTCCTG 585 

Qy 554 GCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCGACC 613

III I II II I III II I I I I I I I I I I I I I I I I I I I I I II I I I I 
Db 586 GTACTCACGGCAAGGGCTGCCAGCTGCTCTGTCAGTGCCACCATGGCGCCAGCTGTGACC 645

Qy 614 ACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATCTTT 673 

III II II I I I I I I I I I II II I I I I I II I I I I I I I I I I I II I 

Db 646 CGCGCACTGGCGAGTGCCTCTGCGCTCCTGGCTACACAGGCGTTTACTGTGAGGAGCTGT 705

Qy 674 GTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAGGAG 733

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 706 GCCCCCCTGGGAGCCATGGAGCTCACTGTGAGCTGCGCTGCCCCTGCCAGAATGGAGGCA 765

Qy 734 TGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGTGTG 793

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 766 CCTGCCACCACATCACTGGCGAATGTGCCTGCCCTCCAGGCTGGACGGGAGCAGTGTGTG 825 

Qy 794 GTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCCATA 853 

I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 826 CCCAGCCCTGCCCTCCAGGGACCTTTGGCCAGAACTGTAGCCAGGACTGTCCCTGCCACC 885

Qy 854 ATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAGGGG 913
I II 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 I I 1 1 1 1 1 II 1 1 1 1
Db 886 ATGGAGGCCAGTGTGACCATGTGACTGGACAATGCCACTGTACAGCTGGATACATGGGGG 945

Qy 914 AACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCTGCC 973
I I I I I I I I I I I I I I I I I I I I I I I II I I I I I III M I
Db 946 ACAGGTGTCAAGAAGAATGTCCCTTTGGAACGTTCGGTTTCCTGTGCTCTCAACGCTGTG 1005

Qy 974
III I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1006

Qy 1034
II I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1066

Qy 1094 ACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAGAGT 1153
I | I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1126 CCTTGCCCTGCCCCTGTGACACCGAGAACACTATCAGCTGCCATCCAGTTACTGGAGCTT 1185

Qy 1154 GTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGATTCT 1213
I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1186 GTACCTGCCAACCAGGCTGGTCTGGCCACTACTGCAATGAGTCCTGCCCCGCCGGCTACT 1245

Qy 1214 ACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGA 1273
III I I MINIM I I I I I I I I I I I I I I I II I I I I I I M I I I I
Db 1246 ATGGCAACGGTTGCCAGCTACCCTGCACCTGCCAGAACGGTGCTGACTGCCACAGTATCA 1305

Qy 1274 CTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTC 1333
I I I I I I I I I I I I I I I I I I I I I II III I II I I I I I I I I
Db 1306 CCGGGAGCTGCACTTGTGCTCCAGGCTTCATGGGAGAGGTGTGTGCCGTCCCCTGTGCTG 1365

Qy 1334 TGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCT 1393
I I || I I I I II I I II I I I I I I I I I I I I I I I I III I I M I
Db 1366 CAGGGACCTATGGTCCCAACTGTTCATCTGTATGTAGCTGTAGCAACGGCGGCACCTGTT 1425

Qy 1394 CTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCA 1453
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I
Db 1426 CCCCAGTGGATGGCTCCTGCACCTGCCGAGAGGGATGGCAGGGCCTGGACTGCTCCCTGC 1485

Qy 1454 GATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGG 1513
III I I I I I I I I I II I I M II M I II II I I I I I I
Db 1486 CTTGTCCCAGTGGGACCTGGGGCCTGAACTGCAATGAGACTTGCATCTGTGCCAATGGAG 1545

Qy 1514 GAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAAT 1573
I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I
Db 1546 CTGCCTGCAGCCCCTTTGATGGGTCCTGTGCCTGCACCCCAGGCTGGCTGGGGGACTCCT 1605

Qy 1574 GCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCA 1633
I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I M I I I
Db 1606 GTGAACTGCCCTGCCCGGACGGCACTTTTGGGCTGAACTGCAGTGAGCATTGCGACTGCA 1665

Qy 1634 GCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAG 1693
I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I M
Db 1666 GCCATGCTGATGGCTGTGACCCTGTCACAGGCCACTGCTGCTGCCTGGCAGGATGGACAG 1725

Qy 1694 GTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCT 1753
I I I I I II I I I I M I I I I I I I II I I I I I I I I I I I I I I I I I I 



Db 1726 GCATCCGCTGTGATAGCACGTGTCCTCCAGGTCGCTGGGGCCCCAACTGTTCAGTGTCCT 1785 

Qy 1754 GCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAG 1813 

II I I I I I II II I III I I I I I I I I II II II I MINIMUM II I 

Db 1786 GCAGCTGTGAGAACGGAGGTTCCTGCTCCCCGGAGGACGGGAGCTGCGAGTGTGCCCCTG 1845

Qy 1814 GCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCA 1873 

I I I I I I I I I II I I I I I I I I I I I I I I I II II I I I I I I I I I I M I I 

Db 1846 GCTTTCGAGGACCCTTATGTCAGAGAATCTGCCCACCAGGATTCTACGGCCATGGCTGCG 1905 

Qy 1874 GCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGT 1933

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II II 
Db 1906 CCCAGCCTTGTCCCCTCTGCGTGCACAGCAGGGGGCCCTGCCACCACATCAGTGGTATCT 1965 

Qy 1934 GTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGAT 1993 

I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I ! 

Db 1966 GTGAGTGCCTGCCAGGATTCTCTGGAGCCTTGTGCAACCAAGTGTGTGCTGGAGGGCACT 2025

Qy 1994 TTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTG 2053

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2026 TCGGGCAGGACTGTGCCCAGCTCTGTTCCTGTGCCAACAATGGGACCTGCAGCCCCATCG 2085

Qy 2054 ACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCAC 2113

I | I I | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 

Db 2086 ATGGCTCCTGTCAGTGCTTCCCTGGGTGGATTGGCAAGGACTGCTCACAGGGTTGCCCTT 2145 

Qy 2114 CTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCA 2173 

III I II I I I I I I I I I I I I II I I Ml 

Db 2146 CAGCATTTTTTGGGAAGGACTGTGGGCACATATGCCAGTGTCAGAATGGAGCCAGCTGTG 2205

Qy 2174 GCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGA 2233 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 

Db 2206 ACCACATCACTGGGAAATGCACCTGTCGAACAGGCTTCTCTGGCCGCCACTGTGAACAGA 2265

Qy 2234 GATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAG 2293

I I I I I I I I I I I I I I I I I I I I I I I I! I I I III I 

Db 2266 GATGTGCCCCTGGAACCTTTGGATATGGGTGTCAGCAGCTATGTGAGTGCATGAACAATG 2325

Qy 2294 CTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACT 2353

I I I I I I I I I I I I I I I M II I I I I I I I I I I III I 

Db 2326 CCACTTGTGACCACGTCACTGGTACCTGTTACTGTAGCCCGGGATTCAAAGGAATCAGGT 2385

Qy 2354 GTGAGCA 2360 

I I I I I I 

Db 2386 GTGACCA 2392 



RESULT 4
AY406556
LOCUS AY406556 2910 bp DNA linear GSS 12-DEC-2003
DEFINITION Mus musculus HCM2596 gene, VIRTUAL TRANSCRIPT, partial sequence,
genomic survey sequence.
ACCESSION AY406556
VERSION AY406556.1 GI:39762527
KEYWORDS GSS.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 2910)
AUTHORS Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
Adams,M.D. and Cargill,M.
TITLE Inferring nonneutral evolution from human-chimp-mouse orthologous
gene trios
JOURNAL Science 302 (5652), 1960-1963 (2003)
PUBMED 14671302
REFERENCE 2 (bases 1 to 2910)
AUTHORS Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
Adams,M.D. and Cargill,M.
TITLE Direct Submission
JOURNAL Submitted (16-NOV-2003) Celera Genomics, 45 West Gude Drive,
Rockville, MD 20850, USA
COMMENT This sequence was made by sequencing genomic exons and ordering
them based on alignment.
FEATURES Location/Qualifiers
source 1..2910
/organism="Mus musculus"
/mol_type="genomic DNA"
/db_xref="taxon:10090"
gene <1..>2910
/locus_tag="HCM2596"
ORIGIN

Query Match 27.3%; Score 933.2; DB 29; Length 2910;
Best Local Similarity 56.2%; Pred. No. 2.5e-255;
Matches 1360; Conservative 0; Mismatches 1052; Indels 6; Gaps 2;



Qy 318 CCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCAAACACCTGTCAGTG 377
III I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 75 CGCTCTGTGTACCGAGGAGTGCATGCACGGCCGCTGTGTCTCTCCCGATACCTGCCACTG 134

Qy 378 TGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGTGATCACTGGGGTCC 437
I I I I I I I II I I I I I I I I I I I I I I I I I I I I
Db 135 TGAGCCTGGATGGGGAGGCCCTGACTGCTCCAGCGNNNNNNNNNNNNNNNNNNNNNNNNN 194

Qy 438 CCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAACCCCATCACCGGGGC 497
Db 195 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 254

Qy 498 TTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGCAGGGCAC 557
Db 255 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 314

Qy 558 CTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCGACCACGT 617
Db 315 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 374

Qy 618 CACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATCTTTGTCC 677
I I I I I I I I I I I I I I
Db 375 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCTGTGAGGAACTCTGCGC 434

Qy 678 737
Db 435 494



Qy 738 TCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGTGTGGTCA 797 

III I I I I I I I I I II I I I I I I I I I I I I 

Db 495 TGACCCGCGCACTGGCGAGTGCCTCTGCGCTCCTGGCTACACANNNNNNNNNNNNNNNNN 554 

Qy 798 GCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCCATAATGG 857 

Db 555 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 614 

Qy 858 AGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAGGGGAACG 917 

Db 615 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 674

Qy 918 GTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTG 977 

I I I I I II I I I I I II I I I I I I I II I I I I I III III III 

Db 675 GTGTCAAGAAGAATGTCCCTTTGGAACGTTCGGTTTCCTGTGCTCTCAACGCTGTGACTG 734 

Qy 978 TGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTGC 1037 

I I I I I I I I I I I I I I I I II II II I I I I I I I I I I 

Db 735 CCACAATGGAGGTCAATGTTCACCAGCCACAGGGGCCTGTGAGTGTGAGCCTGGCTACAA 794

Qy 1038 TGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACAA 1097 

I II I I I I I I I I I I I I I I I I I I I I I I I I I II II I 

Db 795 GGGCCCTAGCTGCCAGGAGCGGCTATGCCCTGAGGGCCTGCATGGCCCAGGCTGCACCTT 854 

Qy 1098 ACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAGAGTGTGC 1157 

I I I I I I I I II I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I 

Db 855 GCCCTGCCCCTGTGACACCGAGAACACTATCAGCTGCCATCCAGTTACTGGAGCTTGTAC 914

Qy 1158 CTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGATTCTACGG 1217 

II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 915 CTGCCAACCAGGCTGGTCTGGCCACTACTGCAATGAGTCCTGCCCCGCCGGCTACTATGG 974

Qy 1218 GGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACTGG 1277 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 975 CAACGGTTGCCAGCTACCCTGCACCTGCCAGAACGGTGCTGACTGCCACAGTATCACCGG 1034 

Qy 1278 AAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTGGG 1337 

I I I I I I I I I I I I I I II I I I I III III I I I I I I II II 

Db 1035 GAGCTGCACTTGTGCTCCAGGCTTCATGGGAGAGGTGTGTGCCGTCCCCTGTGCTGCAGG 1094 

Qy 1338 AACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCTCC 1397

I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I 

Db 1095 GACCTATGGTCCCAACTGTTCATCTGTATGTAGCTGTAGCAACGGCGGCACCTGTTCCCC 1154 

Qy 1398 TGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGATG 1457

I I I I I I I' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 1155 AGTGGATGGCTCCTGCACCTGCCGAGAGGGATGGCAGGGCCTGGACTGCTCCCTGCCTTG 1214 

Qy 1458 TCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGAGC 1517 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II 

Db 1215 TCCCAGTGGGACCTGGGGCCTGAACTGCAATGAGACTTGCATCTGTGCCAATGGAGCTGC 1274 



Qy 1518 CTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGCGA 1577 

Mill I II I II III MM III I I I I I I I I I Mill M I I 

Db 1275 CTGCAGCCCCTTTGATGGGTCCTGTGCCTGCACCCCAGGCTGGCTGGGGGACTCCTGTGA 1334 

Qy 1578 ACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGCCA 1637 

I II M M I II II I I I M I I II II II II I M Mill I II II II M II I I I 

Db 1335 ACTGCCCTGCCCGGACGGCACTTTTGGGCTGAACTGCAGTGAGCATTGCGACTGCAGCCA 1394 

Qy 1638 CGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGTGT 1697 

M I I I II I M II II I I II II I I I I M I II II M I I I II I I I I M I 

Db 1395 TGCTGATGGCTGTGACCCTGTCACAGGCCACTGCTGCTGCCTGGCAGGATGGACAGGCAT 1454 

Qy 1698 CCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGCTA 1757 

II I II II I I I I I I I I I I II I II II M II II II II I I I I M II II I 

Db 1455 CCGCTGTGATAGCACGTGTCCTCCAGGTCGCTGGGGCCCCAACTGTTCAGTGTCCTGCAG 1514 

Qy 1758 CTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGCTT 1817 

MM I II M I III II M M M II M II I II II I M II II I II Mill 

Db 1515 CTGTGAGAACGGAGGTTCCTGCTCCCCGGAGGACGGGAGCTGCGAGTGTGCCCCTGGCTT 1574

Qy 1818 CCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGCCA 1877 

Mill M I I II I I II M I M I I M I I M M I I I I I I I II I III 

Db 1575 TCGAGGACCCTTATGTCAGAGAATCTGCCCACCAGGATTCTACGGCCATGGCTGCGCCCA 1634 

Qy 1878 GACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGTGA 1937 

I I II I I I I II II I I I II I II M I M I I I II M II II M I II I II M I 

Db 1635 GCCTTGTCCCCTCTGCGTGCACAGCAGGGGGCCCTGCCACCACATCAGTGGTATCTGTGA 1694 

Qy 1938 CTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTTGG 1997 

II I II M I I II I I II II I I I II II II II I I II I III I I II 

Db 1695 GTGCCTGCCAGGATTCTCTGGAGCCTTGTGCAACCAAGTGTGTGCTGGAGGGCACTTCGG 1754 

Qy 1998 GAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGACAG 2057

II || II M I I I I I I I II I II I II I II II I II I M I II I M I 

Db 1755 GCAGGACTGTGCCCAGCTCTGTTCCTGTGCCAACAATGGGACCTGCAGCCCCATCGATGG 1814

Qy 2058 ATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCTGC 2117 

I I I M II I II I II I I I II II II III I I I I II II I I I I I I M I I II 

Db 1815 CTCCTGTCAGTGCTTCCCTGGGTGGATTGGCAAGGACTGCTCACAGGCCTGCCCATCTGG 1874 

Qy 2118 CCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGCGC 2177

I II II I I I II II II I I I II I I II MUM II M M II I I M I II 

Db 1875 GTTCTGGGGCTCTGCCTGCTTCCACACATGCAGCTGCCACAACGGGGCGAGCTGCAGCGC 1934 

Qy 2178 CTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGATG 2237

I I II I I I I I II I I I II I I I I I M II II I I I II I I M M M M I I II 

Db 1935 CGAGGATGGGGCCTGCCACTGCACCCCTGGCTGGACTGGACTCTTCTGCACGCAGCGTTG 1994 

Qy 2238 TCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCTGA 2297

III II I II I I II I I I I I I M I I I I M I M I I M I I I I I M I 

Db 1995 CCCTTCAGCATTTTTTGGGAAGGACTGTGGGCACATATGCCAGTGTCAGAATGGAGCCAG 2054 

Qy 2298 CTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGTGA 2357 

|| | || | || || I I I I I I I I I I I I I II I I M II I II I I I I II I II I 

Db 2055 CTGTGACCACATCACTGGGAAATGCACCTGTCGAACAGGCTTCTCTGGCCGCCACTGTGA 2114 

Qy 2358 GCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTGAA 2417
1 1 1 1 II I I 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 2115 ACAGAGATGTGCCCCTGGAACCTTTGGATATGGGTGTCAGCAGCTATGTGAGTGCATGAA 2174

Qy 2418 CAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGAGC 2477
1 I I 1 1 1 1 II 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Mill 1 1 1 1 II IN
Db 2175 CAATGCCACTTGTGACCACGTCACTGGTACCTGTTACTGTAGCCCGGGATTCAAAGGAAT 2234

Qy 2478 GAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACCAG 2537
II 1 1 1 1 1 111 1 1 1 1 1 1 M 1 1 1 1
Db 2235 CAGGTGTGACCAAGCTGCCCTCATGATGGATGAGCTGAATCCCTACACCAAGATCAG 2291

Qy 2538 TACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTTGT 2597
I I 1 1 1 1 1 1 1 1 1 II 1 II 1 1 1 1 1 1 1 1 1 1 1 III 1
Db 2292 TCCAGCTCTGGGAGCAGAGCGGCACTCAGTGGGTGCTGTCACCGGCATCGTTCTCCTGTT 2351

Qy 2598 CCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACAAGCAGAAGGGAAA 2657
I 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 2352 GTTCCTGGTGGTGGTGCTGCTGGGCCTGTTTGCCTGGCGACGGAGGCGGCAGAAAGAGAA 2411

Qy 2658 rraaTfaarrnTr rr ArrAr^TTArrTAnACCCCTGCTATGAGGGTCGTCAATGCAGA 2714
| | M II Ml 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1
Db 2412 AGGCCGTGACCTGGCTCCCCGAGTCTCCTACACCCCAGCCATGAGGATGACCAGCACAGA 2471

Qy 2715 TTATACCATTTCAGGAAC 2732
II 1 1 1 1 1 1 1 1 1
Db 2472 CTACTCTCTCTCAGGCAC 2489





RESULT 5
AK032661
LOCUS AK032661 3466 bp mRNA linear HTC 18-SEP-2003
DEFINITION Mus musculus 10 days neonate cerebellum cDNA, RIKEN full-length
enriched library, clone:6530404N23 product:MEGF11 PROTEIN
(KIAA1781) homolog [Homo sapiens], full insert sequence.
ACCESSION AK032661
VERSION AK032661.1 GI:26082960
KEYWORDS HTC; CAP trapper.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1
AUTHORS Carninci,P. and Hayashizaki,Y.
TITLE High-efficiency full-length cDNA cloning
JOURNAL Meth. Enzymol. 303, 19-44 (1999)
MEDLINE 99279253
PUBMED 10349636
REFERENCE 2
AUTHORS Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
TITLE Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new genes
JOURNAL Genome Res. 10 (10), 1617-1630 (2000)
MEDLINE 20499374
PUBMED 11042159
REFERENCE 3
AUTHORS Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.

TITLE RIKEN integrated sequence analysis (RISA) system--384-format

sequencing pipeline with 384 multicapillary sequencer 

JOURNAL Genome Res. 10 (11), 1757-1771 (2000) 

MEDLINE 20530913 
PUBMED 11076861 
REFERENCE 4 

AUTHORS The RIKEN Genome Exploration Research Group Phase II Team and the 
FANTOM Consortium. 

TITLE Functional annotation of a full-length mouse cDNA collection 

JOURNAL Nature 409, 685-690 (2001) 
REFERENCE 5 

AUTHORS The FANTOM Consortium and the RIKEN Genome Exploration Research 
Group Phase I & II Team. 

TITLE Analysis of the mouse transcriptome based on functional annotation 

of 60,770 full-length cDNAs 

JOURNAL Nature 420, 563-573 (2002) 
REFERENCE 6 (bases 1 to 3466) 

AUTHORS Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Bono,H., Carninci,P.,
Fukuda,S., Furuno,M., Hanagaki,T., Hara,A., Hashizume,W.,
Hayashida,K., Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T.,
Hori,F., Imotani,K., Ishii,Y., Itoh,M., Kagawa,I., Kasukawa,T.,
Katoh,H., Kawai,J., Kojima,Y., Kondo,S., Konno,H., Kouda,M.,
Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Murata,M.,
Nakamura,M., Nishi,K., Nomura,K., Numazaki,R., Ohno,M., Ohsato,N.,
Okazaki,Y., Saito,R., Saitoh,H., Sakai,C., Sakai,K., Sakazume,N.,
Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Tagami,M., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
Takeda,Y., Tanaka,T., Tomaru,A., Toya,T., Yasunishi,A.,
Muramatsu,M. and Hayashizaki,Y.

TITLE Direct Submission 

JOURNAL Submitted (16-JUL-2001) Yoshihide Hayashizaki, The Institute of
Physical and Chemical Research (RIKEN), Laboratory for Genome
Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
Fax:81-45-503-9216)
COMMENT cDNA library was prepared and sequenced in Mouse Genome 

Encyclopedia Project of Genome Exploration Research Group in Riken 
Genomic Sciences Center and Genome Science Laboratory in RIKEN. 
Division of Experimental Animal Research in Riken contributed to 
prepare mouse tissues. 

Please visit our web site for further details. 
URL:http://genome.gsc.riken.go.jp/
URL:http://fantom.gsc.riken.go.jp/.
FEATURES Location/Qualifiers 
source 1..3466
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL/6J"
/db_xref="FANTOM_DB:6530404N23"
/db_xref="MGI:2396257"
/db_xref="taxon:10090"
/clone="6530404N23"
/tissue_type="cerebellum"
/clone_lib="RIKEN full-length enriched mouse cDNA library"
/dev_stage="10 days neonate"
misc_feature 1..3466
/note="MEGF11 PROTEIN (KIAA1781) homolog [Homo sapiens]
(SPTR|Q96KG6, evidence: FASTY, 85.9%ID, 85.4%length,
match=2363)"



ORIGIN 



Query Match 26.1%; Score 892.6; DB 11; Length 3466;
Best Local Similarity 69.1%; Pred. No. 1.3e-243;
Matches 1219; Conservative 0; Mismatches 544; Indels 0; Gaps 0;



Qy 94 AATGTGTGTAGCCACTGGGAAAGCTACTCAGTGACTGTGCAAGAGTCATACCCACATCCC 153
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III
Db 1 AATGTGTGTAGCCACTGGGAGAGCTATGCCGTGACTGTGCATGAGTCTTATGCACACCCC 60

Qy 154 TTTGATCAAATTTACTACACGAGCTGCACTGACATTCTAAACTGGTTTAAATGCACGCGG 213
I M I I I I I I I I I I I I I I I III I II I I I I I I I I I M I | I I M I I | I |
Db 61 TTTGATCAGATCTACTACACACGATGTGCAGACATCCTCAACTGGTTCAAGTGTACCCGG 120

Qy 214 CACAGAGTCAGCTATCGGACAGCCTATCGACATGGGGAGAAGACTATGTATAGGCGCAAG 273
I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I
Db 121 CACCGGATCAGCTATAAGACCGCGTATAGGCGCGGCCTCCGGACCATGTACCGGCGGAGG 180

Qy 274 TCTCAGTGTTGTCCTGGATTTTATGAAAGCGGGGAAATGTGTGTCCCCCACTGTGCTGAT 333
II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I
Db 181 TCCCAATGCTGCCCTGGCTACTATGAGAACGGAGACTTCTGCATTCCTCTGTGTACCGAG 240

Qy 334 AAATGTGTCCATGGTCGCTGTATTGCTCCAAACACCTGTCAGTGTGAGCCTGGCTGGGGA 393
Ml I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I
Db 241 GAGTGCATGCACGGCCGCTGTGTCTCTCCCGATACCTGCCACTGTGAGCCTGGATGGGGA 300

Qy 394 GGGACCAACTGCTCCAGTGCCTGCGATGGTGATCACTGGGGTCCCCACTGCACCAGCCGG 453
II I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I II I I II
Db 301 GGCCCTGACTGCTCCAGCGGCTGTGACAGCGAGCACTGGGGTCCCCACTGCAGCAACCGG 360

Qy 454 TGCCAGTGCAAAAATGGGGCTCTGTGCAACCCCATCACCGGGGCTTGCCACTGTGCTGCG 513
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II
Db 361 TGTCAGTGTCAGAACGGCGCCCTGTGCAACCCTATCACCGGCGCCTGCGTGTGCGCCCCG 420

Qy 514 GGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGCAGGGCACCTATGGTAACGACTGT 573
I I I I I I I I I I I I I I I I I I I I I I M I I I I I | | | | I I I I I I I I I I
Db 421 GGCTTCCGAGGCTGGCGCTGTGAGGAACTCTGCGCTCCTGGTACTCACGGCAAGGGCTGC 480

Qy 574 CATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCGACCACGTCACGGGGGAATGCCGC 633
I I I I I I I I I I I I I I II I I I II I I I I II I I I I I I I I I I I I I I
Db 481 CAGCTGCTCTGTCAGTGCCACCATGGCGCCAGCTGTGACCCGCGCACTGGCGAGTGCCTC 540

Qy 634 TGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGT 693
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I
Db 541 TGCGCTCCTGGCTACACAGGCGTTTACTGTGAGGAGCTGTGCCCCCCTGGGAGCCATGGA 600

Qy 694 CCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAGGAGTGTGTCATCACGTCACTGGA 753
I II 1 1 I 1 1 I I I I I I 1 1 1 II II I I I I 1 1 I I II II III 1 1 I I I I I
Db 601 GCTCACTGTGAGCTGCGCTGCCCCTGCCAGAATGGAGGCACCTGCCACCACATCACTGGC 660

Qy 754 GAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGT 813
I I I I I | I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I M
Db 661 GAATGTGCCTGCCCTCCAGGCTGGACGGGAGCAGTGTGTGCCCAGCCCTGCCCTCCAGGG 720

Qy 814 CGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCCATAATGGAGGGACGTGTGATGCT 873
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 721 ACCTTTGGCCAGAACTGTAGCCAGGACTGTCCCTGCCACCATGGAGGCCAGTGTGACCAT 780

Qy 874 GCCACAGGCCAATGTCATTGCAGTCCAGGATACACAGGGGAACGGTGCCAGGATGAGTGT 933
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 781 GTGACTGGACAATGCCACTGTACAGCTGGATACATGGGGGACAGGTGTCAAGAAGAATGT 840

Qy 934 CCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAG 993
II I I II I I I II I I I I I Ml Ml Ml I I I I I I I I I
Db 841 CCCTTTGGAACGTTCGGTTTCCTGTGCTCTCAACGCTGTGACTGCCACAATGGAGGTCAA 900

Qy 994 TGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAA 1053
I I I I I I I I I I I I I I I I I I I I I I I Ml I I I I I I
Db 901 TGTTCACCAGCCACAGGGGCCTGTGAGTGTGAGCCTGGCTACAAGGGCCCTAGCTGCCAG 960

Qy 1054 GCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACAAACGGTGTCCCTGCCAC 1113
I I I I I I I I I I I I I I I II I Ml II I I I I I I I I I II
Db 961 GAGCGGCTATGCCCTGAGGGCCTGCATGGCCCAGGCTGCACCTTGCCCTGCCCCTGTGAC 1020

Qy 1114 TTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGG 1173
I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I
Db 1021 ACCGAGAACACTATCAGCTGCCATCCAGTTACTGGAGCTTGTACCTGCCAACCAGGCTGG 1080

Qy 1174 TCAGGACTCTACTGTAATGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAG 1233
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1081 TCTGGCCACTACTGCAATGAGTCCTGCCCCGCCGGCTACTATGGCAACGGTTGCCAGCTA 1140

Qy 1234 ATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCC 1293
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I
Db 1141 CCCTGCACCTGCCAGAACGGTGCTGACTGCCACAGTATCACCGGGAGCTGCACTTGTGCT 1200

Qy 1294 CCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAAC 1353
I I I M I I I I III I II I I I I I I II I I I I I I I I II III
Db 1201 CCAGGCTTCATGGGAGAGGTGTGTGCCGTCCCCTGTGCTGCAGGGACCTATGGTCCCAAC 1260

Qy 1354 TGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGT 1413
I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I
Db 1261 TGTTCATCTGTATGTAGCTGTAGCAACGGCGGCACCTGTTCCCCAGTGGATGGCTCCTGC 1320

Qy 1414 ACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGG 1473
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1321 ACCTGCCGAGAGGGATGGCAGGGCCTGGACTGCTCCCTGCCTTGTCCCAGTGGGACCTGG 1380

Qy 1474 GGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGAC 1533
I M I I I I I I I I I I I I II I I I M I I I I I I I I I II I I I
Db 1381 GGCCTGAACTGCAATGAGACTTGCATCTGTGCCAATGGAGCTGCCTGCAGCCCCTTTGAT 1440

Qy 1534 GGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGAT 1593
I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I
Db 1441 GGGTCCTGTGCCTGCACCCCAGGCTGGCTGGGGGACTCCTGTGAACTGCCCTGCCCGGAC 1500



Qy 1594 GGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCAC 1653 

I I I I I I I I I I I I I I I I I Mill I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 1501 GGCACTTTTGGGCTGAACTGCAGTGAGCATTGCGACTGCAGCCATGCTGATGGCTGTGAC 1560 

Qy 1654 CCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGTGTCCACTGTGACAGCGTG 1713 

Ml I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 

Db 1561 CCTGTCACAGGCCACTGCTGCTGCCTGGCAGGATGGACAGGCATCCGCTGTGATAGCACG 1620 

Qy 1714 TGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGCTACTGTAAAAATGGGGCT 1773

I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1621 TGTCCTCCAGGTCGCTGGGGCCCCAACTGTTCAGTGTCCTGCAGCTGTGAGAACGGAGGT 1680 

Qy 1774 TCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGCTTCCGAGGCACCACTTGT 1833 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II Ml 

Db 1681 TCCTGCTCCCCGGAGGACGGGAGCTGCGAGTGTGCCCCTGGCTTTCGAGGACCCTTATGT 1740 

Qy 1834 CAGAGGATCTGCTCCCCTGGTTT 1856 



Db 1741 CAGAGAAGTAAGGCTCCTGATTT 1763
RESULT 6
AY406555
LOCUS       AY406555                2910 bp    DNA     linear   GSS 12-DEC-2003
DEFINITION  Pan troglodytes HCM2596 gene, VIRTUAL TRANSCRIPT, partial sequence,
            genomic survey sequence.
ACCESSION   AY406555
VERSION     AY406555.1  GI:39762526
KEYWORDS    GSS.
SOURCE      Pan troglodytes (chimpanzee)
  ORGANISM  Pan troglodytes
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Pan.
REFERENCE   1  (bases 1 to 2910)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Inferring nonneutral evolution from human-chimp-mouse orthologous
            gene trios
  JOURNAL   Science 302 (5652), 1960-1963 (2003)
   PUBMED   14671302
REFERENCE   2  (bases 1 to 2910)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-NOV-2003) Celera Genomics, 45 West Gude Drive,
            Rockville, MD 20850, USA
COMMENT     This sequence was made by sequencing genomic exons and ordering
            them based on alignment.
FEATURES             Location/Qualifiers
     source          1..2910
                     /organism="Pan troglodytes"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9598"
     gene            <1..>2910
                     /locus_tag="HCM2596"
ORIGIN

Query Match 19.1%; Score 653.6; DB 29; Length 2910; 

Best Local Similarity 37.9%; Pred. No. 4.5e-175; 

Matches 916; Conservative 0; Mismatches 1493; Indels 6; Gaps 2; 



Qy 318 CCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCAAACACCTGTCAGTG 377 
I 

Db 75 CGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 134 

Qy 378 TGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGTGATCACTGGGGTCC 437

Db 135 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 194 

Qy 438 CCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAACCCCATCACCGGGGC 497

Db 195 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 254 

Qy 498 TTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGCAGGGCAC 557 

Db 255 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 314 

Qy 558 CTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCGACCACGT 617

Db 315 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 374 

Qy 618 CACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATCTTTGTCC 677 

III I I I I I II II II 

Db 375 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCTGCGAGGAGCTGTGCCC 434 

Qy 678 TCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAGGAGTGTG 737

I I I I I I I I I I I I I I I I I I Ill I I I II I I I I I II 

Db 435 TCCTGGGAGCCATGGAGCTCACTGTGAGCTGCGCTGCCCCTGTCAGAATGGGGGCACCTG 494

Qy 738 TCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGTGTGGTCA 797

I I I I I I I I I I I I I I II I I I I I I II 

Db 495 CCACCACATCACTGGCGAGTGTGCCTGCCCCCCAGNCNGGACGNNNNNNNNNNNNNNNNN 554

Qy 798 GCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCCATAATGG 857

Db 555 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 614 

Qy 858 AGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAGGGGAACG 917

Db 615 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 674 

Qy 918 GTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTG 977 

Db 675 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 734 

Qy 978 TGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTGC 1037

Db 735 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 794



Qy 1038 TGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACAA 1097

Db 795 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 854

Qy 1098 ACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAGAGTGTGC 1157

Db 855 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 914

Qy 1158 CTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGATTCTACGG 1217

Db 915 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 974

Qy 1218 GGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACTGG 1277

Db 975 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 1034

Qy 1278 AAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTGGG 1337

Db 1035 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 1094

Qy 1338 AACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCTCC 1397

Db 1095 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 1154

Qy 1398 TGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGATG 1457
I I I I I I I I I I I I I I I I I I I III 
Db 1155 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNGGTGGCAGGGCCTGGACTGCACCCTGCCATG 1214

Qy 1458 TCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGAGC 1517
I I I I I I I I I I I I I I I I I I I Ml III I III II I I I I! I I I I I 
Db 1215 TCCCAGTGGGACATGGGGCCTGAACTGCAACGAGAGCTGCACCTGTGCCAATGGGGCAGC 1274

Qy 1518 CTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGCGA 1577
III I II I I I I! I I I I I I I I I I I I I I I I I I III I II II 
Db 1275 CTGNNGCCCCATNGNCGGCTCCTGCTCCTGCACTCCTGGCTGGCTGGGAGNCACCTGTGA 1334

Qy 1578 ACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGCCA 1637
I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1335 GCTGCCTTGCCCGNNNNNNNNNNNNNNNNNNNNNNGCAGTGAACACTGTGACTGCAGCCN 1394

Qy 1638 CGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGTGT 1697
I I I I I I I I I I I I I I I II 
Db 1395 NNNNNNTGGATGTGACCCCGTCACAGGCNNNNNNNNNNNNNNNNNNNNNNNNNNNNGCAT 1454

Qy 1698 CCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGCTA 1757
I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 
Db 1455 CCACTGTGACAGCACGTGTCCACCTGGCCGCTGGGGCCCCAACTGCTCTGTCTCCTGCAG 1514

Qy 1758 CTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGCTT 1817
I I I I I I I I I I I II I I I I I I I I II I I I I I I II I I I I I I I I I I II I I I I I 
Db 1515 CTGTGAGAATGGAGGCTCCTGCTCCCCAGAGGATGGGAGCTGCGAGTGTGCCCCTGGCTT 1574

Qy 1818 CCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGCCA 1877
I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I III 
Db 1575 CCGAGGACCCTTATGCCAGAGAATCTGCCCCCCTGGGTTCTATGGCCACGGCTGCGCCCA 1634



Qy 1878 GACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGTGA 1937

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I 

Db 1635 GCCATGCCCCCTCTGCGTGCACAGCAGCGGGCCCTGCCACCACATCAGCGGCATCTGTGA 1694 

Qy 1938 CTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTTGG 1997 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I M I I 

Db 1695 GTGCCTCCCAGGATTCTCTGGAGCTCTCTGCAACCAAGTGTGTGCTGGAGGATACTTTGG 1754 

Qy 1998 GAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGACAG 2057

II I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1755 GCAGGACTGTGCCCAGCTCTGCTCCTGTGCCAACAACGGGACCTGCAGCCCTATCGATGG 1814 

Qy 2058 ATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCTGC 2117

II I II I I I I I I I I I I I I I I Ill 

Db 1815 CTCCTGCCAGTGCTTTCCTGGATGGATTGGCAAGGACTGCTCACAGGNNNNNNNNNNNNN 1874 

Qy 2118 CCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGCGC 2177 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 1875 NNNNNNNNNNNNNNNNNNNNNCCACGCATGCAGCTGCCACAACGGGGCGAGCTGCAGCGC 1934 

Qy 2178 CTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGATG 2237

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I ! I I I I I I 

Db 1935 CGAGGACGGGGCCTGCCACTGCACCCCTGGCTGGACTGGACTCTTCTGCACACAGCNNNG 1994 

Qy 2238 TCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCTGA 2297

II II I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 

Db 1995 CCCAGCAGCATTTTTTGGGAAGGACTGTGGGCGCGTATGCCAGTGTCAGAATGGCGCCAG 2054 

Qy 2298 CTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGTGA 2357 

I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I Ml I I I I I I I I 

Db 2055 CTGTGACCACATCAGTGGCAAGTGCACCTGCCGCACAGGCTTCACCGGGCAACACTGTGA 2114 

Qy 2358 GCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTGAA 2417

I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 

Db 2115 GCAGAGATGTGCCCCAGGAACCTTTGGCTATGGGTGNNAGCAGCTATGTGAGTGCATGAA 2174

Qy 2418 CAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGAGC 2477 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2175 CAACTCCACCTGTGACCATGTCACCGGCACCTGTTACTGCAGCCCTGGCTTCAAAGGAAT 2234 

Qy 2478 GAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACCAG 2537

I I I II I I I I I I I I I I I I I I I I I I I I I I II II I I I I 

Db 2235 CAGGTGTGACCAAGCTGCCCTCATGATGGAGG AGCT GAAT C C CT AC AC C AAGAT CAG 2291 

Qy 2538 TACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTTGT 2597

I I I II I I I I I II I I I I I I I I I I I I I I I I I III I 

Db 2292 CCCAGCACTGGGTGCAGAGCGGCACTCGGTGGGTGCTGTCACAGGCATCATGCTCCTGTT 2351

Qy 2598 CCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACAAGCAGAAGGGAAA 2657 

I I I I I I I I I I I I III I I I I I I I I I I I I I 

Db 2352 ATTCCTCATTGTGGTGCTGCTGGGCCTATTTGCCTGGCATCGGCGGCGGCAGAAAGAGAA 2411 

Qy 2658 GGAAT CAAGCAT G C CAG CAG T T AC C T AC AC C C C T G C T AT GAG G GT C GT CAAT G CAG A 2714 

I I III II II I II I I I I I I I I I I I I I I I I II III 

Db 2412 AGGCCGAGACCTGGCTCCCCGTGTCTCCTACACACCTGCCATGAGGATGACCAGCACCGA 2471



Qy 2715 TTATACCATTTCAGG 2729

Db 2472 CTACNNNNNNNNNNG 2486
RESULT 7
AK053551
LOCUS       AK053551                3556 bp    mRNA    linear   HTC 20-SEP-2003
DEFINITION  Mus musculus 0 day neonate eyeball cDNA, RIKEN full-length enriched
            library, clone:E130107B17 product:similar to MEGF12 [Mus musculus],
            full insert sequence.
ACCESSION   AK053551
VERSION     AK053551.1  GI:26343538
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata;
            Mammalia; Eutheria; Rodentia;
REFERENCE   1
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning

            prepare full-length cDNA libraries for rapid discovery of new genes
  JOURNAL   Genome Res. 10 (10), 1617-1630 (2000)
  MEDLINE   20499374
   PUBMED   11042159

            Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
            Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
  TITLE     RIKEN integrated sequence analysis (RISA) system--384-format
            sequencing pipeline with 384 multicapillary sequencer
  JOURNAL   Genome Res. 10 (11), 1757-1771 (2000)
  MEDLINE   20530913
   PUBMED   11076861
REFERENCE   4
  AUTHORS   The RIKEN Genome Exploration Research Group Phase II Team and the
            FANTOM Consortium.
  TITLE     Functional annotation of a full-length mouse cDNA collection
  JOURNAL   Nature 409, 685-690 (2001)

  AUTHORS   The FANTOM Consortium and the RIKEN Genome Exploration Research
            Group Phase I & II Team.
  TITLE     Analysis of the mouse transcriptome based on functional annotation
            of 60,770 full-length cDNAs
  JOURNAL   Nature 420, 563-573 (2002)

  AUTHORS   Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Bono,H., Carninci,P.,

            Fukuda,S., Furuno,M., Hanagaki,T., Hara,A., Hashizume,W.,

            Nakamura,M., Nishi,K., Nomura,K., Numazaki,R., Ohno,M., Ohsato,N.,
            Okazaki,Y., Saito,R., Saitoh,H., Sakai,C., Sakai,K., Sakazume,N.,
            Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
            Sogabe,Y., Tagami,M., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
  TITLE     Direct Submission
  JOURNAL   Submitted (16-JUL-2001) Yoshihide Hayashizaki, The Institute of
            Physical and Chemical Research (RIKEN), Laboratory for Genome
            Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
            RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
            Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
            URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
            Fax:81-45-503-9216)
COMMENT     cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues.
            Please visit our web site for further details.
            URL:http://genome.gsc.riken.go.jp/
            URL:http://fantom.gsc.riken.go.jp/.
FEATURES             Location/Qualifiers
     source          1..3556
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:E130107B17"
                     /db_xref="MGI:2425343"
                     /db_xref="taxon:10090"
                     /clone="E130107B17"
                     /tissue_type="eyeball"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="0 day neonate"
     CDS             205..3309
                     /note="unnamed protein product; putative
                     similar to MEGF12 [Mus musculus] (SPTR|AAL33583, evidence:
                     FASTY, 72.7%ID, 41.5%length, match=1604)"
                     /codon_start=1
                     /protein_id="BAC35426.1"
                     /db_xref="GI:26343539"

/translation="MPLCPLLLLALGLRLTGTLNSNDPNVCTFWESFTTTTKESHLRP 
FSLLPAESCHRPWEDPHTCAQPTWYRTVYRQWKMDSRPRLQCCRGYYESRGACVPL 
CAQECVHGRCVAPNQCQCAPGWRGGDCSSECAPGMWGPQCDKFCHCGNNSSCDPKSGT 
CFCPSGLQPPNCLQPCPAGHYGPACQFDCQCYGASCDPQDGACFCPPGRAGPSCNVPC 
SQGTDGFFCPRTYPCQNGGVPQGSQGSCSCPPGWMGVICSLPCPEGFHGPNCTQECRC 
HNGGLCDRFTGQCHCAPGYIGDRCQEECPVGRFGQDCAETCDCAPGARCFPANGACLC 
EHGFTGDRCTERLCPDGRYGLSCQEPCTCDPEHSLSCHPMHGECSCQPGWAGLHCNES 
CPQDTHGPGCQEHCLCLHGGLCLADSGLCRCAPGYTGPHCANLCPPDTYGINCSSRCS 
CENAIACSPIDGTCICKEGWQRGNCSVPCPLGTWGFNCNASCQCAHDGVCSPQTGACT 
CTPGWHGAHCQLPCPKGQFGEGCASVCDCDHSDGCDPVHGQCRCQAGWMGTRCHLPCP 
EGFWGANCSNTCTCKNGGTCVSENGNCVCAPGFRGPSCQRPCPPGRYGKRCVQCKCNN 



NHSSCHPSDGTCSCLAGWTGPDCSEACPPGHWGLKCSQLCQCHHGGTCHPQDGSCICT 
PGWTGPNCLEGCPPRMFGVNCSQLCQCDLGEMCHPETGACVCPPGHSGADCKMGSQES 
FTIMPTSPVTHNSLGAVIGIAVLGTLWALIALFIGYRQWQKGKEHEHLAVAYSTGRL 
DGSDYVMPDVSPSYSHYYSNPSYHTLSQCSPNPPPPNKVPGSQLFVSSQAPERPSRAH 
GRENHVTLPADWKHRREPHERGASHLDRSYSCSYSHRNGPGPFCHKGPISEEGLGASV 
MSLSSENPYATIRDLPSLPGEPRESGYVEMKGPPSVSPPRQSLHLRDRQQRQLQPQRD 
SGTYEQPSPLSHNEESLGSTPPLPPGLPPGHYDSPKNSHIPGHYDLPPVRHPPSPPSR 
RQDR" 

ORIGIN 

Query Match 18.9%; Score 646.2; DB 11; Length 3556; 

Best Local Similarity 57.1%; Pred. No. 6.8e-173; 

Matches 1239; Conservative 0; Mismatches 918; Indels 12; Gaps 3; 



Qy 74 CTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACTCAGTGACTGTGC 133

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 257 CACTCAACTCCAATGATCCCAATGTCTGTACCTTTTGGGAAAGCTTCACCACGACCACTA 316

Qy 134 AAGAGT CAT AC C CAC AT C C CT T T GAT C AAAT T T AC T AC AC GAG CT GC ACT GAC AT T C 190 

I I I I I I I I I I I I I I I I I I I II I I I 

Db 317 AGGAGTCCCACCTGCGCCCCTTCAGCCTGCTCCCAGCTGAGTCCTGCCACAGGCCCTGGG 376 

Qy 191 TAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTATCGACATGGGG 250

II I II I I I I I I 1 I I I I I II I I II II II I II 

Db 377 AGGACCCCCACACCTGTGCCCAGCCTACGGTTGTCTACCGGACTGTGTACCGTCAGGTGG 436 

Qy 251 AGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAAAGCGGGGAAA 310

Mill I II I I I I I I I I I I I I I I I I I I I I 

Db 437 TGAAGATGGACTCCCGCCCACGCCTGCAGTGCTGTAGGGGTTACTACGAGAGCAGAGGGG 496

Qy 311 TGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCAAACACCT 370

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 497 CCTGTGTCCCACTCTGTGCCCAGGAGTGTGTCCATGGTCGCTGTGTGGCTCCGAATCAGT 556 

Qy 371 GTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGTGATCACT 430

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I 

Db 557 GCCAGTGTGCACCAGGCTGGCGGGGTGGCGACTGCTCCAGCGAGTGTGCCCCGGGAATGT 616 

Qy 431 GGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAACCCCATCA 490

I I I I I I I I I I II I I I I I II II I M I I I I I I 

Db 617 GGGGACCACAGTGTGACAAGTTCTGCCACTGTGGCAACAACAGTTCCTGTGATCCCAAGA 676

Qy 491 CCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGC 550 

I I I I I II II I I I I I I I I I I I I I I I Ml 

Db 677 GTGGGACGTGCTTTTGCCCCTCTGGCCTGCAGCCCCCCAACTGCCTTCAGCCCTGCCCTG 736 

Qy 551 AGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCG 610

Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I Mill 

Db 737 CCGGCCACTATGGTCCTGCCTGCCAGTTTGATTGCCAGTGC TAT GGGG CAT CCTGTG 793 

Qy 611 ACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATC 670 

I I I I II I I II M MM I II I I I I 

Db 794 ACCCCCAGGATGGAGCCTGTTTCTGCCCTCCAGGGAGAGCAGGACCCAGCTGTAATGTGC 853 

Qy 671 TTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAG 730

I I I I I I I I MM II I I M I M I I I I M I I II 

Db 854 CCTGTTCACAGGGCACTGATGGCTTCTTCTGCCCCAGAACCTATCCTTGCCAAAATGGAG 913 



Qy 731 GAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGT 790
Ml III I I II III I I I II I I I I I I I I I II I M 
Db 914 GTGTTCCTCAGGGCTCCCAAGGCTCCTGCAGCTGCCCACCGGGCTGGATGGGTGTCATTT 973

Qy 791 GTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCC 850
II I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I MM 
Db 974 GTTCCCTGCCATGCCCAGAGGGTTTCCATGGACCCAACTGTACTCAGGAATGTCGCTGCC 1033

Qy 851 ATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAG 910
I I I I I I I II II I I I I I II I I II M M I II I I I I 
Db 1034 ACAACGGTGGCCTCTGTGACAGGTTTACTGGGCAGTGCCACTGTGCTCCTGGCTATATCG 1093

Qy 911 GGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCT 970
I I I I I I I I I I II II I II II II II II II II I II II I M II I II I 
Db 1094 GGGATCGGTGCCAAGAAGAGTGCCCCGTGGGCCGCTTCGGTCAAGACTGTGCTGAGACCT 1153

Qy 971 GCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAG 1030
I I I I II III III I I I I M I I II II I 
Db 1154 GTGACTGTGCTCCTGGCGCCCGTTGCTTTCCTGCTAATGGCGCGTGTCTGTGTGAACATG 1213

Qy 1031 GCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAAT 1090
I I I I I I I I I I I I II II I II I I I II I I I I II I I II I 
Db 1214 GCTTCACAGGCGACCGCTGCACTGAGCGCCTCTGTCCGGATGGCCGCTATGGTCTGAGCT 1273

Qy 1091 GTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAG 1150
I I II II I I I I I II II I I II II I I I I I I I II I II I II I 
Db 1274 GCCAGGAGCCCTGCACCTGCGACCCAGAACACAGTCTCAGCTGCCACCCGATGCACGGCG 1333

Qy 1151 AGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGAT 1210
II II II I I I II M I I II I I M I II II I I I I II II II III I 
Db 1334 AGTGCTCCTGCCAGCCAGGTTGGGCGGGCCTCCACTGCAACGAGAGCTGCCCTCAGGACA 1393

Qy 1211 TCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTG 1270
III I II I I I I I I I II I II I I I II I Ml I 
Db 1394 CGCATGGCCCGGGCTGCCAGGAGCACTGCCTCTGTCTGCACGGAGGGCTCTGCCTTGCCG 1453

Qy 1271 TGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCC 1330
I || Ml I I I I I I II I I I I III I I I I I I I M I I I II I 
Db 1454 ACAGCGGCCTCTGCCGGTGCGCGCCGGGATACACGGGACCTCACTGCGCTAACCTATGTC 1513

Qy 1331 CTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCT 1390
I I I I I I I I II I I I II I I II II I I I II II I II I I II M II III 
Db 1514 CACCGGACACTTACGGGATCAACTGTTCCTCCCGCTGCTCCTGTGAAAATGCCATTGCCT 1573

Qy 1391 GCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCA 1450
I I I I II I I I II II I II I II I I I I I II I I II II II I II II II 
Db 1574 GCTCTCCCATCGACGGCACGTGCATCTGCAAGGAAGGTTGGCAGCGTGGTAACTGCTCTG 1633

Qy 1451 TCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACG 1510
I I I I I I I I II II I II II I II I II I I I I I I I M II I II I 
Db 1634 TTCCCTGTCCCCTTGGCACCTGGGGCTTCAATTGCAATGCCAGTTGCCAGTGTGCCCACG 1693

Qy 1511 GGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGA 1570
I I II II I II I I II II I M I II II I I I I II I I I I III 
Db 1694 ACGGAGTCTGCAGCCCCCAAACTGGAGCCTGTACTTGCACCCCTGGGTGGCATGGTGCTC 1753

Qy 1571 AATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACT 1630
I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I II 
Db 1754 ACTGCCAGCTTCCCTGCCCGAAGGGACAGTTTGGTGAAGGCTGTGCCAGTGTCTGTGACT 1813

Qy 1631 GCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGT 1690
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1814 GTGAGCACTCTGATGGCTGTGACCCTGTTCATGGACAGTGCCGATGTCAGGCTGGTTGGA 1873

Qy 1691 CAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGC 1750
II I I I I II II I I I I I I I I I I I I I I I I I I I 
Db 1874 TGGGCACACGCTGCCACCTGCCTTGCCCGGAGGGCTTTTGGGGAGCCAACTGCAGTAACA 1933

Qy 1751 CCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCAC 1810
I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1934 CCTGTACCTGCAAGAATGGTGGTACCTGTGTGTCTGAGAATGGCAACTGCGTGTGCGCAC 1993

Qy 1811 CAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCT 1870
I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1994 CAGGGTTCCGAGGCCCCTCCTGCCAGAGGCCCTGCCCGCCTGGTCGCTATGGCAAACGCT 2053

Qy 1871 GCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCC 1930
I II I I I I I I I I I I I I I I I I I III 
Db 2054 GTGTGCA -AT GCAAGT GTAACAACAAC CAT T CT T CCT GCCACC CAT C GGAC GGGA 2107

Qy 1931 TGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCA 1990
II I I I I I I I I I I I I I I I I I I I I I I III I I I I I I III 
Db 2108 CCTGCTCCTGCCTGGCGGGCTGGACAGGCCCTGACTGCTCCGAGGCATGTCCCCCAGGCC 2167

Qy 1991 GATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCA 2050
I II I I I I I I I I I II I I I I I I I I I I I I I 
Db 2168 ACTGGGGACTCAAATGCTCCCAACTCTGCCAGTGTCATCATGGTGGGACCTGCCACCCCC 2227

Qy 2051 TTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTC 2110
II I III II I I I I I II I I I I I I I I I I II III 
Db 2228 AGGATGGGAGCTGTATCTGCACGCCAGGCTGGACTGGACCCAACTGCTTGGAAGGCTGCC 2287

Qy 2111 CACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCT 2170
I I I I I II I I I I II III II III II I I I I II 
Db 2288 CACCAAGAATGTTTGGTGTCAACTGCTCCCAGCTATGTCAGTGTGATCTCGGAGAGATGT 2347

Qy 2171 GCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTC 2230
II II I I I I I I III II I I I I I I II I I I I I I 
Db 2348 GCCACCCAGAGACTGGGGCTTGTGTCTGTCCCCCAGGACACAGTGGTGCAGACTGCAAAA 2407

Qy 2231 AGAGATGTC 2239
I I I I I 
Db 2408 TGGGAAGCC 2416



RESULT 8
CD803668
LOCUS       CD803668                 789 bp    mRNA    linear   EST 15-JUL-2003
DEFINITION  UI-M-GV0-chv-b-08-0-UI.r1 NIH_BMAP_GV0 Mus musculus cDNA clone
            IMAGE:30544831 5', mRNA sequence.
ACCESSION   CD803668
VERSION     CD803668.1  GI:32462494
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 789)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Jim Lin, University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: Distribution information can be found at
            http://genome.uiowa.edu/distribution/mousef1.html
            This clone was contributed by the Brain Molecular Anatomy Project
            (BMAP)
            Seq primer: pYX-5.
FEATURES             Location/Qualifiers
     source          1..789
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:30544831"
                     /tissue_type="whole brain"
                     /dev_stage="1, 5, and 15 days newborn"
                     /lab_host="DH10B (T1 phage resistant)"
                     /clone_lib="NIH_BMAP_GV0"
                     /note="Organ: Brain; Vector: pYX-Asc; Site_1: EcoR I;
                     Site_2: Not I; The library was constructed according
                     Bonaldo, Lennon and Soares, Genome Research, 6:791-806,
                     1996. Denatured RNA was size fractionated on a 1% agarose
                     gel. First strand cDNA synthesis was primed with oligo-dT
                     primer containing a Not I site. Double strand cDNA was
                     size selected according to mRNA size fraction, ligated with
                     EcoR I adaptor, digested with NotI and then cloned
                     directionally into pYX-Asc vector. The library tag
                     sequence located between the Not I site and the polyA tail
                     is CGAACTGAAT. This library was created for the University
                     Iowa Brain Anatomy Project (BMAP): 'Gene Discovery in the
                     Developing Mouse Nervous System', supported by National
                     Institute of Mental Health (NIMH), Hemin Chin, Ph.D.,
                     program coordinator."
ORIGIN



Query Match 17.2%; Score 587.6; DB 14; Length 789;

Best Local Similarity 86.3%; Pred. No. 1.7e-156;

Matches 681; Conservative 0; Mismatches 104; Indels 4; Gaps 3;

Qy 35 TTTGTTTATTGTTATGCCACTGGATTGGGACAGCATCACCTCTGAATC--TTGAAGACCC 92
| | | I I I I I II I I I I I I I I I I I I II II I MINIM 
Db 1 TCTGCTCACTGCTCTGTCACTGGGTGGGGACAGCATCCTCCCNTGAACCNTGGAAGACCC 60

Qy 93 TAATGTGTGTAGCCACT-GGGAAAGCTACTCAGTGACTGTGCAAGAGTCATACCCACATC 151
|| || M I I II II I M I M II II I I I I I I II I II I I II II I II M M I M M 
Db 61 CAACGTATGCAGCCACTGGGGAAAGCTACTCGGTGACTGTGCAGGAGTCGTATCCACATC 120



Qy 152 CCTTTGATCAAATTTACTACACGAGCTGCACTGACATTCTAAACTGGTTTAAATGCACGC 211 

I I I I I I I I I II I I I I I I I I I I I I I I II Mill II II I I I I! I I I I I II I I I I 

Db 121 CCTTCGATCAGATCTACTACACAAGCTGCACCGACATCCTGAACTGGTTTAAATGCACAC 180

Qy 212 GGCACAGAGTCAGCTATCGGACAGCCTATCGACATGGG-GAGAAGACTATGTATAGGCGC 270

I I I I I I I I I I I I I I I I I I I I I I I I II II II III II I I I II I I I I I I I I IN 

Db 181 GGCACAGAATCAGCTACCGGACAGCCTACCGCCACGGGNGAGAAAACCATGTATAGACGC 240

Qy 271 AAGTCTCAGTGTTGTCCTGGATTTTATGAAAGCGGGGAAATGTGTGTCCCCCACTGTGCT 330 

II II I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 AAATCCCAGTGTTGCCCAGGATTTTATGAAAGCCGAGACATGTGTGTCCCTCACTGTGCT 300 

Qy 331 GATAAATGTGTCCATGGTCGCTGTATTGCTCCAAACACCTGTCAGTGTGAGCCTGGCTGG 390 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I II I I I I I 

Db 301 GATAAATGTGTCCATGGTCGCTGCATTGCTCCAAACACCTGTCAGTGTGAGCCTGGCTGG 360 

Qy 391 GGAGGGACCAACTGCTCCAGTGCCTGCGATGGTGATCACTGGGGTCCCCACTGCACCAGC 450

II I I I I I I I I I II I I I I I I II I I I I I II I I II I I II I I II I I I I I I I I I I I 

Db 361 GGTGGGACCAACTGTAGCAGTGCTTGTGATGGTGATCACTGGGGGCCTCACTGCAGCAGC 420 

Qy 451 CGGTGCCAGTGCAAAAATGGGGCTCTGTGCAACCCCATCACCGGGGCTTGCCACTGTGCT 510 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 CGATGCCAGTGCAAAAACAGAGCTTTGTGTAACCCCATCACCGGTGCTTGCCACTGCGCT 480

Qy 511 GCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGCAGGGCACCTATGGTAACGAC 570

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II MINIMI 

Db 481 GCGGGCTACCGGGGATGGCGCTGCGAGGACCGTTGTGAACAGGGCACGTACGGTAACGAC 540

Qy 571 TGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCGACCACGTCACGGGGGAATGC 630

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 541 TGTCACCAAAGATGCCAGTGTCAGAATGGAGCGACCTGTGACCACATCACTGGGGAATGC 600

Qy 631 CGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATCTTTGTCCTCCTGGTAAACAT 690 

II II I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I MM 

Db 601 CGTTGTTCACCTGGGTACACTGGAGCCTTCTGTGAGGATCTTTGTCCTCCTGGCANACAT 660 

Qy 691 GGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAGGAGTGTGTCATCACGTCACT 750

I II I I I I I I I II I I II I I I II II II II Mill II I II II II II I I I I II 

Db 661 GGTCCACATTGTGAGCAGAGGTGTCCCTGCCANAATGGGGGTGTGTGCCACCATGTCACT 720 

Qy 751 GGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGTGTGGTCAGCCTTGCCCCGAG 810 

II I II I I I I II I I M I II M I I I I II I M II II I I II M II II M II I Mill II 

Db 721 GGAGAGTGCTCTTGCCCTTCTGGNTGGATGGGCACAGTGTGTGGTCAGCCCTGCCCTGAN 780

Qy 811 GGTCGCTTT 819 

II II II M I 
Db 781 GGTCGCTTT 789



RESULT 9
BU215784
LOCUS       BU215784                 921 bp    mRNA    linear   EST 25-NOV-2002
DEFINITION  603106167F1 CSEQCHN04 Gallus gallus cDNA clone ChEST45f18 5', mRNA
            sequence.
ACCESSION   BU215784
VERSION     BU215784.1  GI:25394329
KEYWORDS    EST.
SOURCE      Gallus gallus (chicken)
  ORGANISM  Gallus gallus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Archosauria; Aves; Neognathae; Galliformes; Phasianidae;
            Phasianinae; Gallus.
REFERENCE   1  (bases 1 to 921)
  AUTHORS   Boardman,P.E., Sanz-Ezquerro,J., Overton,I.M., Burt,D.W., Bosch,E.,
            Fong,W.T., Tickle,C., Brown,W.R.A., Wilson,S.A. and Hubbard,S.J.
  TITLE     A Comprehensive Collection of Chicken cDNAs
  JOURNAL   Curr. Biol. 12 (22), 1965-1969 (2002)
  MEDLINE   22335534
   PUBMED   12445392
COMMENT     Contact: Simon Hubbard
            Department of Biomolecular Sciences
            University of Manchester Institute of Science and Technology
            (UMIST)
            PO Box 88, Manchester, M60 1QD, UK
            Tel: 01612008930
            Fax: 01612360409
            Email: Simon.Hubbard@umist.ac.uk.
FEATURES             Location/Qualifiers
     source          1..921
                     /organism="Gallus gallus"
                     /mol_type="mRNA"
                     /strain="White Leghorn, Hisex"
                     /db_xref="taxon:9031"
                     /clone="ChEST45f18"
                     /tissue_type="whole embryo"
                     /dev_stage="20-21"
                     /lab_host="DH10B"
                     /clone_lib="CSEQCHN04"
                     /note="Organ: whole embryo; Vector: pBluescript II KS(+);
                     Site_1: EcoRI; Site_2: NotI; This normalized library was
                     constructed from 1 million independent clones. cDNA
                     synthesis was initiated using an oligo(dT) primer, using
                     methylated C in the first strand synthesis reaction.
                     Following this first strand reaction, double-stranded cDNA
                     was blunted, ligated to NotI adapters, digested with
                     EcoRI, size-selected, and cloned into the NotI and EcoRI
                     compatible sites of a custom modified MCS of the
                     pBluescript (KS+) vector. The library was normalized in 2
                     rounds using conditions adapted from Soares et al., PNAS
                     (1994) 91: 9228-9232 and Bonaldo et al., Genome Research 6
                     (1996): 791, except that a significantly longer
                     reannealing hybridization was used."
ORIGIN



Query Match 15.7%; Score 539; DB 13; Length 921;
Best Local Similarity 75.6%; Pred. No. 1.7e-142;
Matches 695; Conservative 0; Mismatches 220; Indels 4; Gaps 2;



Qy    787 GTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAG 846
          || |||||||||||||| || || ||||| | ||| || ||||||||||| || || |||
Db      3 GTCTGTGGTCAGCCTTGTCCTGAAGGTCGTTATGGGAATAACTGTTCCCAGGAGTGTCAG 62

Qy    847 TGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATAC 906
          ||||| ||||||||||  || ||  | || ||||| ||||| |||||||| ||||| ||
Db     63 TGCCACAATGGAGGGATCTGCGACTCAGCGACAGGTCAATGCCATTGCAGCCCAGGCTAT 122

Qy    907 ACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAG 966
          ||||| ||||| || || ||||||||||| || || || || || || |  |||||||||
Db    123 ACAGGAGAACGCTGTCAAGATGAGTGTCCAGTGGGCACTTACGGAGTACGGTGTGCTGAG 182

Qy    967 ACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAA 1026
          |||||||||||| | || || ||||| || |||||  | ||||| || ||||||||||||
Db    183 ACCTGCCAGTGTATGAATGGTGGGAAATGCTACCATATCAGCGGTGCTTGCCTCTGTGAA 242

Qy   1027 GCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATC 1086
           |||| | | |||| |||| ||| ||| || | || || ||||||||| | || ||  ||
Db    243 CCAGGATATACTGGAGAGCACTGTGAAACAAGGCTTTGCCCTGAGGGGGTTTATGGTCTC 302

Qy   1087 AAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCT 1146
          ||||||||||||  |||||||||||||||| | || ||    ||||||||||| ||||||
Db    303 AAATGTGACAAAAAGTGTCCCTGCCACTTGCATAATACCTGGAGCTGTCACCCTATGTCT 362

Qy   1147 GGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCT 1206
          || || ||  |||||||||| |||||||| || |||||||| || ||||||||||||||
Db    363 GGGGAATGCTCCTGCAAGCCCGGCTGGTCTGGGCTCTACTGCAACGAGACATGTTCTCCG 422

Qy   1207 GGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGAC 1266
          || ||||||||  |  |||| |||||||| ||||||||||||||||| || |||||||||
Db    423 GGGTTCTACGGCAAGTCTTGTCAGCAGATTTGCAGCTGCCAAAATGGTGCTGACTGTGAC 482

Qy   1267 AGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCA 1326
          || ||||||||||| |||| ||||||| | ||||| || ||   ||| ||   ||| ||
Db    483 AGCGTGACTGGAAAATGCATCTGTGCCTCTGGATTTAAGGGTGCTGATTGTGGTACTCCT 542

Qy   1327 TGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCA 1386
          || | || ||| || |||||  |||| ||||| |||  |||   ||||||||| || ||
Db    543 TGTCTTCCGGGTACATATGGAGTAAATTGTTCTTCTGTCTGCAACTGTAAAAACGAAGCC 602

Qy   1387 GTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGC 1446
           |||| ||  | || || || |||||||| ||||| ||||| ||||| || || |||||
Db    603 ATCTGTTCATCAGTAGATGGTTCTTGTACCTGCAAAGCAGGTTGGCATGGTGTAGACTGT 662

Qy   1447 TCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTC 1506
          || ||||  |||||||| || || |||||  ||||||| |||||||| |||||||| ||
Db    663 TCAATCAATTGTCCCAGCGGTACCTGGGGACTTGGCTGCAACTTAACTTGCCAGTGTCTT 722

Qy   1507 AACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGG 1566
          ||||| || || |||||  | || || || ||||| || || || || || ||| | ||
Db    723 AACGGAGGGGCTTGCAATGCTCTAGATGGAACCTGTACCTGCGCTCCGGGCTGGAGAGGA 782

Qy   1567 GAGAAATGCGAACTTCCCT--GCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCT 1624
          |||||      |  |||||  ||||||| || || || ||  || | |||||||||||||
Db    783 GAGAACGTGTGAACTCCCTTGGCCAGGACGGTACCTATGGCATGGATTGTGCTGAGCGCT 842

Qy   1625 GCG--ACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCC 1682
          | |   |||||||         || || || |||||||||||  | || | ||| || ||
Db    843 GTGGACCTGCAGCATGCAGATGGGTTGTCATCCTACCACGGGTTACTGTCCCTGTCTACC 902

Qy   1683 GGGATGGTCAGGTGTCCAC 1701
           |   |  | | |  || |
Db    903 AGATGGTCCGGTTTCCCTC 921



RESULT 10 

CD802967 

LOCUS       CD802967     755 bp    mRNA    linear   EST 15-JUL-2003
DEFINITION  UI-M-GV0-chl-j-22-0-UI.r1 NIH_BMAP_GV0 Mus musculus cDNA clone
            IMAGE:30543885 5', mRNA sequence.
ACCESSION   CD802967
VERSION     CD802967.1  GI:32461793
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 755)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Jim Lin, University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: Distribution information can be found at
            http://genome.uiowa.edu/distribution/mousef1.html
            This clone was contributed by the Brain Molecular Anatomy Project
            (BMAP)
            Seq primer: pYX-5.
FEATURES             Location/Qualifiers
     source          1..755
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:30543885"
                     /tissue_type="whole brain"
                     /dev_stage="1, 5, and 15 days newborn"
                     /lab_host="DH10B (T1 phage resistant)"
                     /clone_lib="NIH_BMAP_GV0"
                     /note="Organ: Brain; Vector: pYX-Asc; Site_1: EcoR I;
                     Site_2: Not I; The library was constructed according
                     Bonaldo, Lennon and Soares, Genome Research, 6:791-806,
                     1996. Denatured RNA was size fractionated on a 1% agarose
                     gel. First strand cDNA synthesis was primed with oligo-dT
                     primer containing a Not I site. Double strand cDNA was
                     size selected according to mRNA size fraction, ligated with
                     EcoR I adaptor, digested with NotI and then cloned
                     directionally into pYX-Asc vector. The library tag
                     sequence located between the Not I site and the polyA tail
                     is CGAACTGAAT. This library was created for the University
                     Iowa Brain Anatomy Project (BMAP): 'Gene Discovery in the
                     Developing Mouse Nervous System', supported by National
                     Institute of Mental Health (NIMH), Hemin Chin, Ph.D.,
                     program coordinator."
ORIGIN



Query Match 15.5%; Score 530.4; DB 14; Length 755; 

Best Local Similarity 84.5%; Pred. No. 4.4e-140; 

Matches 594; Conservative 0; Mismatches 109; Indels 0; Gaps 0; 

Qy      1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60
          ||||  |||||||  |  || |||||| || | || || | | || | || |||||| |
Db     52 ATGGCGATTTCTTCAAGTTCGTGCCTGGGCCTCATCTGCTCACTGCTCTGTCACTGGGTG 111

Qy     61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120
          |||||||||||  | ||||| || |||||||| || || || ||||||||||||||||||
Db    112 GGGACAGCATCCTCCCTGAACCTGGAAGACCCCAACGTATGCAGCCACTGGGAAAGCTAC 171

Qy    121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180
          || ||||||||||| ||||| || ||||||||||| ||||| || |||||||| ||||||
Db    172 TCGGTGACTGTGCAGGAGTCGTATCCACATCCCTTCGATCAGATCTACTACACAAGCTGC 231

Qy    181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240
          || ||||| || ||||||||||||||||| ||||||||| ||||||| |||||||||||
Db    232 ACCGACATCCTGAACTGGTTTAAATGCACACGGCACAGAATCAGCTACCGGACAGCCTAC 291

Qy    241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300
          || || |||||||| || |||||||| ||||| || |||||||| || ||||||||||||
Db    292 CGCCACGGGGAGAAAACCATGTATAGACGCAAATCCCAGTGTTGCCCAGGATTTTATGAA 351

Qy    301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360
          ||| | || ||||||||||| |||||||||||||||||||||||||||||||| ||||||
Db    352 AGCCGAGACATGTGTGTCCCTCACTGTGCTGATAAATGTGTCCATGGTCGCTGCATTGCT 411

Qy    361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420
          |||||||||||||||||||||||||||||||| |||||||||||   |||||| || |||
Db    412 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGTGGGACCAACTGTAGCAGTGCTTGTGAT 471

Qy    421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480
          |||||||||||||| || ||||||| |||||| ||||||||||||||  | ||| ||||
Db    472 GGTGATCACTGGGGGCCTCACTGCAGCAGCCGATGCCAGTGCAAAAACAGAGCTTTGTGT 531

Qy    481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540
          |||||||||||||| ||||||||||| |||||||||| |||||| |||||||||||||||
Db    532 AACCCCATCACCGGTGCTTGCCACTGCGCTGCGGGCTACCGGGGATGGCGCTGCGAGGAC 591

Qy    541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600
          || ||||| |||||||| || |||||||||||||| || ||||||||||| |||||||||
Db    592 CGTTGTGAACAGGGCACGTACGGTAACGACTGTCACCANAGATGCCAGTGTCAGAATGGA 651

Qy    601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660
          || ||||| |||||| |||| ||||||||||| ||  |||| |  |||||  |   ||||
Db    652 GCGACCTGTGACCACATCACTGGGGAATGCCGTTGTTCACCTGNGTACACTNGGAGCTTC 711

Qy    661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTG 703
          ||||||||||||||||||||||| ||    |   ||  ||| |
Db    712 TGTGAGGATCTTTGTCCTCCTGGCAACATGGTCACATTGTGAG 754



RESULT 11 
BU056532 



LOCUS       BU056532     643 bp    mRNA    linear   EST 26-AUG-2002
DEFINITION  UI-M-FO0-cab-n-24-0-UI.r1 NIH_BMAP_FO0 Mus musculus cDNA clone
            IMAGE:6409175 5', mRNA sequence.
ACCESSION   BU056532
VERSION     BU056532.1  GI:22496609
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 643)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Jim Lin, University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            This clone was contributed by the Brain Molecular Anatomy Project
            (BMAP)
            Seq primer: pYX-5.
FEATURES             Location/Qualifiers
     source          1..643
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:6409175"
                     /tissue_type="whole brain"
                     /dev_stage="embryo 12.5dpc"
                     /lab_host="DH10B (T1 phage resistant)"
                     /clone_lib="NIH_BMAP_FO0"
                     /note="Organ: Brain; Vector: pYX-Asc; Site_1: EcoR I;
                     Site_2: Not I; The library was constructed according
                     Bonaldo, Lennon and Soares, Genome Research, 6:791-806,
                     1996. Denatured RNA was size fractionated on a 1% agarose
                     gel. First strand cDNA synthesis was primed with oligo-dT
                     primer containing a Not I site. Double strand cDNA was
                     size selected according to mRNA size fraction, ligated
                     with EcoR I adaptor, digested with NotI and then cloned
                     directionally into pYX-Asc vector. The library tag
                     sequence located between the Not I site and the polyA tail
                     is TGAGAGAGCC. This library was created for the University
                     Iowa Brain Anatomy Project (BMAP): 'Gene Discovery in the
                     Developing Mouse Nervous System', supported by National
                     Institute of Mental Health (NIMH), Hemin Chin, Ph.D.,
                     program coordinator."
ORIGIN



Query Match 14.9%; Score 508.6; DB 13; Length 643;
Best Local Similarity 86.9%; Pred. No. 7.2e-134;
Matches 559; Conservative 0; Mismatches 84; Indels 0; Gaps 0;



Qy   2222 ACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAAT 2281
          |||||||||||||||| ||||| || || ||||| || || |||||||||||||||||||
Db      1 ACTGCACTCAGAGATGCCCTCTGGGCTTCTATGGTAAGGACTGTGCACTGATATGCCAAT 60

Qy   2282 GTCAAAACGGAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCA 2341
          ||||||||||||||||||||||||| || || ||||||||||| |||||||| |||||||
Db     61 GTCAAAACGGAGCTGACTGCGACCATATCTCGGGGCAGTGTACCTGCCGCACGGGATTCA 120

Qy   2342 TGGGACGGCACTGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGA 2401
          |||||||||||||||| |||||||||||| | |||||||| |||||||||||||||||||
Db    121 TGGGACGGCACTGTGAACAGAAGTGCCCTGCGGGAACATACGGCTATGGCTGTCGCCAGA 180

Qy   2402 TATGTGATTGTCTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCC 2461
          | ||||| |||||||||||||||||||| |||||||||||||| || |||||||| ||||
Db    181 TCTGTGACTGTCTGAACAACTCCACCTGTGACCACATCACTGGCACGTGTTACTGTAGCC 240

Qy   2462 CCGGATGGAAGGGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACA 2521
          | |||||||| || ||  ||||||| |||||||| |||||||| || || ||||||||||
Db    241 CAGGATGGAAAGGGGCACGATGTGACCAAGCTGGGGTTATCATCGTGGGCAATCTGAACA 300

Qy   2522 GCTTAAGCCGAACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAG 2581
          |||||||||| ||||| || || || ||||| |||||||| |||||||||||||| || |
Db    301 GCTTAAGCCGGACCAGCACCGCCCTTCCTGCCGATTCCTATCAGATCGGGGCCATCGCGG 360

Qy   2582 GCATCATCATTCTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGAC 2641
          ||||| |  | || || || ||||| |||||||| |||||  ||||||| || || ||||
Db    361 GCATCGTGGTCCTCGTTCTTGTTGTGCTCTTCCTGCTGGCGCTGTTCATCATCTACAGAC 420

Qy   2642 ACAAGCAGAAGGGAAAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGGG 2701
          ||||||||||| | ||||||||||||||||| || || ||||||||||| || |||||||
Db    421 ACAAGCAGAAGAGGAAGGAATCAAGCATGCCGGCCGTGACCTACACCCCCGCCATGAGGG 480

Qy   2702 TCGTCAATGCAGATTATACCATTTCAGGAACCCTTCCTCACAGCAATGGTGGAAACGCTA 2761
          || |||||||||| ||||||||  ||| |||||| |||||||||||||||||||| || |
Db    481 TCATCAATGCAGACTATACCATCGCAGAAACCCTGCCTCACAGCAATGGTGGAAATGCCA 540

Qy   2762 ATAGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCTC 2821
          | ||||||||||| |||||||||||||| ||||| || | ||||||||||||||||||||
Db    541 ACAGCCACTACTTTACCAATCCCAGTTATCACACACTTAGCCAGTGTGCCACATCCCCTC 600

Qy   2822 ACGTCAACAACAGGGACAGGATGACTGTCACGAAGTCAAAAAA 2864
          | || ||||| ||||||||||||||  |  | |||||||||||
Db    601 ATGTGAACAATAGGGACAGGATGACCATTGCAAAGTCAAAAAA 643



RESULT 12 

AK048840 

LOCUS       AK048840    2944 bp    mRNA    linear   HTC 20-SEP-2003
DEFINITION  Mus musculus 0 day neonate cerebellum cDNA, RIKEN full-length
            enriched library, clone:C230074H09 product:MEGF11 PROTEIN
            (KIAA1781) homolog [Homo sapiens], full insert sequence.
ACCESSION   AK048840
VERSION     AK048840.1  GI:26339599
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning
  JOURNAL   Meth. Enzymol. 303, 19-44 (1999)
  MEDLINE   99279253
   PUBMED   10349636
REFERENCE   2
  AUTHORS   Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Normalization and subtraction of cap-trapper-selected cDNAs to
            prepare full-length cDNA libraries for rapid discovery of new gene
  JOURNAL   Genome Res. 10 (10), 1617-1630 (2000)
  MEDLINE   20499374
   PUBMED   11042159
REFERENCE   3
  AUTHORS   Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
            Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
            Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
            Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
            Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
            Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
            Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
  TITLE     RIKEN integrated sequence analysis (RISA) system--384-format
            sequencing pipeline with 384 multicapillary sequencer
  JOURNAL   Genome Res. 10 (11), 1757-1771 (2000)
  MEDLINE   20530913
   PUBMED   11076861
REFERENCE   4
  AUTHORS   The RIKEN Genome Exploration Research Group Phase II Team and the
            FANTOM Consortium.
  TITLE     Functional annotation of a full-length mouse cDNA collection
  JOURNAL   Nature 409, 685-690 (2001)
REFERENCE   5
  AUTHORS   The FANTOM Consortium and the RIKEN Genome Exploration Research
            Group Phase I & II Team.
  TITLE     Analysis of the mouse transcriptome based on functional annotation
            of 60,770 full-length cDNAs
  JOURNAL   Nature 420, 563-573 (2002)
REFERENCE   6  (bases 1 to 2944)
  AUTHORS   Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Bono,H., Carninci,P.,
            Fukuda,S., Furuno,M., Hanagaki,T., Hara,A., Hashizume,W.,
            Hayashida,K., Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T.,
            Hori,F., Imotani,K., Ishii,Y., Itoh,M., Kagawa,I., Kasukawa,T.,
            Katoh,H., Kawai,J., Kojima,Y., Kondo,S., Konno,H., Kouda,M.,
            Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Murata,M.,
            Nakamura,M., Nishi,K., Nomura,K., Numazaki,R., Ohno,M., Ohsato,N.,
            Okazaki,Y., Saito,R., Saitoh,H., Sakai,C., Sakai,K., Sakazume,N.,
            Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
            Sogabe,Y., Tagami,M., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
            Takeda,Y., Tanaka,T., Tomaru,A., Toya,T., Yasunishi,A.,
            Muramatsu,M. and Hayashizaki,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-JUL-2001) Yoshihide Hayashizaki, The Institute of
            Physical and Chemical Research (RIKEN), Laboratory for Genome
            Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
            RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
            Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
            URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
            Fax:81-45-503-9216)
COMMENT     cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues.
            Please visit our web site for further details.
            URL:http://genome.gsc.riken.go.jp/
            URL:http://fantom.gsc.riken.go.jp/.
FEATURES             Location/Qualifiers
     source          1..2944
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:C230074H09"
                     /db_xref="MGI:2415973"
                     /db_xref="taxon:10090"
                     /clone="C230074H09"
                     /tissue_type="cerebellum"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="0 day neonate"
     CDS             154..1053
                     /note="unnamed protein product; MEGF11 PROTEIN (KIAA1781)
                     homolog [Homo sapiens] (SPTR|Q96KG6, evidence: FASTY,
                     85.9%ID, 85.4%length, match=2363)
                     putative"
                     /codon_start=1
                     /protein_id="BAC33471.1"
                     /db_xref="GI:26339600"
                     /translation="MAPSAVGLLVFLLQAALALNPEDPNVCSHWESYAVTVQESYAHP
                     FDQIYYTRCADILNWFKCTRHRISYKTAYRRGLRTMYRRRSQCCPGYYENGDFCIPLC
                     TEECMHGRCVSPDTCHCEPGWGGPDCSSGCDSEHWGPHCSNRCQCQNGALCNPITGAC
                     VCAPGFRGWRCEELCAPGTHGKGCQLLCQCHHGASCDPRTGECLCAPGYTGVYCEELC
                     PPGSHGAHCELRCPCQNGGTCHHITGECACPPGWTGAVCAQPCPPGTFGQNCSQDCPC
                     HHGGQCDHVTGQCHCTAGYMGDR"
ORIGIN



Query Match 13.7%; Score 470; DB 11; Length 2944;
Best Local Similarity 72.2%; Pred. No. 2.3e-122;
Matches 611; Conservative 0; Mismatches 235; Indels 0; Gaps 0;



Qy     74 CTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACTCAGTGACTGTGC 133
          ||||||| | ||||||||| |||||||||||||||||||| |||||  | ||||||||||
Db    206 CTCTGAACCCTGAAGACCCCAATGTGTGTAGCCACTGGGAGAGCTATGCCGTGACTGTGC 265

Qy    134 AAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGCACTGACATTCTAA 193
          | ||||| ||  |||| ||||||||||| || ||||||||  | ||  | ||||| || |
Db    266 AGGAGTCTTATGCACACCCCTTTGATCAGATCTACTACACACGATGTGCAGACATCCTCA 325

Qy    194 ACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTATCGACATGGGGAGA 253
          ||||||| || || || |||||| |  ||||||||  ||| || ||| | |  ||
Db    326 ACTGGTTCAAGTGTACCCGGCACCGGATCAGCTATAAGACCGCGTATAGGCGCGGCCTCC 385

Qy    254 AGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAAAGCGGGGAAATGT 313
           ||| |||||  |||| | ||| || || || ||||| |  ||||| | ||| ||  | |
Db    386 GGACCATGTACCGGCGGAGGTCCCAATGCTGCCCTGGCTACTATGAGAACGGAGACTTCT 445

Qy    314 GTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCAAACACCTGTC 373
          |  | || |  ||| | ||  | ||  | || || |||||| |  ||||  | ||||| |
Db    446 GCATTCCTCTGTGTACCGAGGAGTGCATGCACGGCCGCTGTGTCTCTCCCGATACCTGCC 505

Qy    374 AGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGTGATCACTGGG 433
          | ||||||||||| ||||||||  |  |||||||||| | ||| ||  | || |||||||
Db    506 ACTGTGAGCCTGGATGGGGAGGCCCTGACTGCTCCAGCGGCTGTGACAGCGAGCACTGGG 565

Qy    434 GTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAACCCCATCACCG 493
          |||||||||||| || |||||| |||||  | || || || ||||||||||| |||||||
Db    566 GTCCCCACTGCAGCAACCGGTGTCAGTGTCAGAACGGCGCCCTGTGCAACCCTATCACCG 625

Qy    494 GGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGCAGG 553
          | || |||   || ||  ||||||||| |||||||||||| ||||| | ||| |  |  |
Db    626 GCGCCTGCGTGTGCGCCCCGGGCTTCCGAGGCTGGCGCTGTGAGGAACTCTGCGCTCCTG 685

Qy    554 GCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCGACC 613
          | ||  | || || | ||| || | |   || ||||||||  |||| |||| ||| ||||
Db    686 GTACTCACGGCAAGGGCTGCCAGCTGCTCTGTCAGTGCCACCATGGCGCCAGCTGTGACC 745

Qy    614 ACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATCTTT 673
              ||| || || |||| |||| | || || ||||| || |  | ||||||||| || |
Db    746 CGCGCACTGGCGAGTGCCTCTGCGCTCCTGGCTACACAGGCGTTTACTGTGAGGAGCTGT 805

Qy    674 GTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAGGAG 733
          | || ||||| |  |||||  | || ||||||| | | ||||| || || ||||||||
Db    806 GCCCCCCTGGGAGCCATGGAGCTCACTGTGAGCTGCGCTGCCCCTGCCAGAATGGAGGCA 865

Qy    734 TGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGTGTG 793
            || || ||| ||||||| |||||  | |||||| | ||||||| |||  |||||||||
Db    866 CCTGCCACCACATCACTGGCGAATGTGCCTGCCCTCCAGGCTGGACGGGAGCAGTGTGTG 925

Qy    794 GTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCCATA 853
            ||||| |||||    ||   ||||||  ||||||||  ||| || || |  |||||
Db    926 CCCAGCCCTGCCCTCCAGGGACCTTTGGCCAGAACTGTAGCCAGGACTGTCCCTGCCACC 985

Qy    854 ATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAGGGG 913
          |||||||   ||||||   ||  || || ||||| || || |   | |||||||  ||||
Db    986 ATGGAGGCCAGTGTGACCATGTGACTGGACAATGCCACTGTACAGCTGGATACATGGGGG 1045

Qy    914 AACGGT 919
          |  |||
Db   1046 ACAGGT 1051



RESULT 13 

BG828819 

LOCUS       BG828819     755 bp    mRNA    linear   EST 22-MAY-2001
DEFINITION  602751395F1 NIH_MGC_17 Homo sapiens cDNA clone IMAGE:4904255 5',
            mRNA sequence.
ACCESSION   BG828819
VERSION     BG828819.1  GI:14176418
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 755)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: ATCC
            cDNA Library Preparation: Ling Hong/Rubin Laboratory
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLCM1803 row: m column: 24
            High quality sequence stop: 707.
FEATURES             Location/Qualifiers
     source          1..755
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:4904255"
                     /tissue_type="rhabdomyosarcoma"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NIH_MGC_17"
                     /note="Organ: muscle; Vector: pOTB7; Site_1: EcoRI;
                     Site_2: XhoI; cDNA made by oligo-dT priming.
                     Directionally cloned into EcoRI/XhoI sites using the
                     following 5' adaptor: GGCACGAG(G). Size-selected >500bp
                     for average insert size 1.8kb. Library constructed by
                     Ling Hong in the laboratory of Gerald M. Rubin (University
                     of California, Berkeley) using ZAP-cDNA synthesis kit
                     (Stratagene) and Superscript II RT (Life Technologies)."
ORIGIN



Query Match 13.3%; Score 456.6; DB 12; Length 755;
Best Local Similarity 98.8%; Pred. No. 6.7e-119;
Matches 481; Conservative 0; Mismatches 4; Indels 2; Gaps 2;



Qy      1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    268 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 327

Qy     61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    328 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 387

Qy    121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    388 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 447

Qy    181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    448 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 507

Qy    241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    508 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 567

Qy    301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTC-GCTGTATTGC 359
          ||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||
Db    568 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGGCTGTATTGC 627

Qy    360 TCCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGA 419
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    628 TCCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGA 687

Qy    420 TGGTGATCACTGGGGTCCCCACTGCACCAGCCGGTG-CCAGTGCAAAAATGGGGCTCTGT 478
          ||||||||||||| ||||||| |||||||||||||| |||||||||| ||||||||||||
Db    688 TGGTGATCACTGGAGTCCCCAATGCACCAGCCGGTGCCCAGTGCAAACATGGGGCTCTGT 747

Qy    479 GCAACCC 485
          || ||||
Db    748 GCCACCC 754



RESULT 14 

BM719978 

LOCUS       BM719978     565 bp    mRNA    linear   EST 01-MAR-2002
DEFINITION  UI-E-EJ0-ahu-i-16-0-UI.r1 UI-E-EJO Homo sapiens cDNA clone
            UI-E-EJ0-ahu-i-16-0-UI 5', mRNA sequence.
ACCESSION   BM719978
VERSION     BM719978.1  GI:19038910
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 565)
  AUTHORS   Bonaldo,M.F., Lennon,G. and Soares,M.B.
  TITLE     Normalization and subtraction: two approaches to facilitate gene
            discovery
  JOURNAL   Genome Res. 6 (9), 791-806 (1996)
  MEDLINE   97044477
   PUBMED   8889548
COMMENT     Contact: Soares, MB
            Coordinated Laboratory for Computational Genomics
            University of Iowa
            375 Newton Road, 4156 MEBRF, Iowa City, IA 52242, USA
            Tel: 319 335 8250
            Fax: 319 335 9565
            Email: bento-soares@uiowa.edu
            Tissue Procurement: Dr. Gregg Hageman
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: Researchers may obtain clones from Research
            Genetics (www.resgen.com).
            Seq primer: M13 Reverse.
FEATURES             Location/Qualifiers
     source          1..565
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="UI-E-EJ0-ahu-i-16-0-UI"
                     /tissue_type="fetal eyes, lens, eye anterior segment,
                     optic nerve, retina, Retina Foveal and Macular, RPE and
                     Choroid"
                     /dev_stage="fetal and adult"
                     /lab_host="DH10B (Life Technologies) (T1 phage resistant)"
                     /clone_lib="UI-E-EJO"
                     /note="Organ: eye; Vector: pT7T3-Pac (Pharmacia) with a
                     modified polylinker; Site_1: EcoR I; Site_2: Not I;
                     UI-E-EJO is a subtracted cDNA library constructed
                     according to Bonaldo, Lennon and Soares, Genome Research,
                     6:791-806, 1996. First strand cDNA synthesis was primed
                     with an oligo-dT primer containing a Not I site. Double
                     stranded cDNA was ligated to an EcoR I adaptor, digested
                     with Not I, and cloned directionally into pT7T3-Pac
                     vector. The oligonucleotide used to prime the synthesis of
                     first-strand cDNA contains a library tag sequence that is
                     located between the Not I site and the (dT)18 tail. The
                     sequence tags for this library are: fetal eyes,
                     AGAATCAAGA; lens, CGATTAGCGA; eye anterior segment,
                     AATGCCGCAT; optic nerve, CCATTAAGTG; retina, CCGCG; Retina
                     Foveal and Macular, GTCC; RPE and Choroid, ACCTA. This
                     library was created for the program, Gene Discovery in the
                     Visual System, supported by National Eye Institute (NEI)."
ORIGIN



Query Match 12.6%; Score 433; DB 12; Length 565; 

Best Local Similarity 100.0%; Pred. No. 3.3e-112; 

Matches 433; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   2800 ACCCAGTGTGCCACATCCCCTCACGTCAACAACAGGGACAGGATGACTGTCACGAAGTCA 2859
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ACCCAGTGTGCCACATCCCCTCACGTCAACAACAGGGACAGGATGACTGTCACGAAGTCA 60

Qy   2860 AAAAACAATCAACTGTTTGTGAATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTG 2919
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 AAAAACAATCAACTGTTTGTGAATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTG 120

Qy   2920 GGGGACTGCACTGGGACATTGCCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTC 2979
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 GGGGACTGCACTGGGACATTGCCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTC 180

Qy   2980 GGTGCTTTTGGACTTGACAGAAGCTATATGGGAAAATCCTTAAAAGACCTGGGAAAGAAT 3039
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 GGTGCTTTTGGACTTGACAGAAGCTATATGGGAAAATCCTTAAAAGACCTGGGAAAGAAT 240

Qy   3040 TCTGAATATAATTCAAGTAACTGCTCCCTAAGCAGTTCTGAGAACCCATATGCCACTATT 3099
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 TCTGAATATAATTCAAGTAACTGCTCCCTAAGCAGTTCTGAGAACCCATATGCCACTATT 300

Qy   3100 AAAGACCCACCTGTACTTATCCCGAAAAGCTCAGAGTGTGGTTATGTGGAGATGAAATCG 3159
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 AAAGACCCACCTGTACTTATCCCGAAAAGCTCAGAGTGTGGTTATGTGGAGATGAAATCG 360

Qy   3160 CCGGCACGAAGAGATTCCCCATATGCAGAGATCAATAACTCAACTTCAGCCAACAGGAAT 3219
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 CCGGCACGAAGAGATTCCCCATATGCAGAGATCAATAACTCAACTTCAGCCAACAGGAAT 420

Qy   3220 GTCTATGAAGTTG 3232
          |||||||||||||
Db    421 GTCTATGAAGTTG 433



RESULT 15
BM676825/c
LOCUS       BM676825                 598 bp    mRNA    linear   EST 27-FEB-2002
DEFINITION  UI-E-EJ0-ahu-i-16-0-UI.s2 UI-E-EJ0 Homo sapiens cDNA clone
            UI-E-EJ0-ahu-i-16-0-UI 3', mRNA sequence.
ACCESSION   BM676825
VERSION     BM676825.1  GI:18986721
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 598)
  AUTHORS   Bonaldo,M.F., Lennon,G. and Soares,M.B.
  TITLE     Normalization and subtraction: two approaches to facilitate gene
            discovery
  JOURNAL   Genome Res. 6 (9), 791-806 (1996)
  MEDLINE   97044477
   PUBMED   8889548
COMMENT     Contact: Soares, MB
            Coordinated Laboratory for Computational Genomics
            University of Iowa
            375 Newton Road, 4156 MEBRF, Iowa City, IA 52242, USA
            Tel: 319 335 8250
            Fax: 319 335 9565
            Email: bento-soares@uiowa.edu
            Tissue Procurement: Dr. Gregg Hageman
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: Researchers may obtain clones from Research
            Genetics (www.resgen.com).
            Seq primer: M13 Forward
            POLYA=Yes.
FEATURES             Location/Qualifiers
     source          1..598
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="UI-E-EJ0-ahu-i-16-0-UI"
                     /tissue_type="fetal eyes, lens, eye anterior segment,
                     optic nerve, retina, Retina Foveal and Macular, RPE and
                     Choroid"
                     /dev_stage="fetal and adult"
                     /lab_host="DH10B (Life Technologies) (T1 phage resistant)"
                     /clone_lib="UI-E-EJ0"
                     /note="Organ: eye; Vector: pT7T3-Pac (Pharmacia) with a
                     modified polylinker; Site_1: EcoR I; Site_2: Not I;
                     UI-E-EJ0 is a subtracted cDNA library constructed
                     according to Bonaldo, Lennon and Soares, Genome Research,
                     6:791-806, 1996. First strand cDNA synthesis was primed
                     with an oligo-dT primer containing a Not I site. Double
                     stranded cDNA was ligated to an EcoR I adaptor, digested
                     with Not I, and cloned directionally into pT7T3-Pac
                     vector. The oligonucleotide used to prime the synthesis of
                     first-strand cDNA contains a library tag sequence that is
                     located between the Not I site and the (dT)18 tail. The
                     sequence tags for this library are: fetal eyes,
                     AGAATCAAGA; lens, CGATTAGCGA; eye anterior segment,
                     AATGCCGCAT; optic nerve, CCATTAAGTG; retina, CCGCG; Retina
                     Foveal and Macular, GTCC; RPE and Choroid, ACCTA. This
                     library was created for the program, Gene Discovery in the
                     Visual System, supported by National Eye Institute (NEI).
                     TAG_TISSUE=human fetal eyes
                     TAG_LIB=UI-E-EJ0
                     TAG_SEQ=AGAATCAAGA"



ORIGIN 



Query Match 12.6%; Score 432; DB 12; Length 598; 

Best Local Similarity 98.9%; Pred. No. 6.6e-112; 

Matches 435; Conservative 0; Mismatches 5; Indels 0; Gaps 0; 

Qy     2793 CACGCTCACCCAGTGTGCCACATCCCCTCACGTCAACAACAGGGACAGGATGACTGTCAC 2852
            ||||   |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      596 CACGAGGACCCAGTGTGCCACATCCCCTCACGTCAACAACAGGGACAGGATGACTGTCAC 537

Qy     2853 GAAGTCAAAAAACAATCAACTGTTTGTGAATCTTAAAAATGTGAACCCTGGGAAGAGAGG 2912
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      536 GAAGTCAAAAAACAATCAACTGTTTGTGAATCTTAAAAATGTGAACCCTGGGAAGAGAGG 477

Qy     2913 CCCTGTGGGGGACTGCACTGGGACATTGCCGGCTGACTGGAAACATGGCGGCTACCTCAA 2972
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      476 CCCTGTGGGGGACTGCACTGGGACATTGCCGGCTGACTGGAAACATGGCGGCTACCTCAA 417

Qy     2973 CGAGCTCGGTGCTTTTGGACTTGACAGAAGCTATATGGGAAAATCCTTAAAAGACCTGGG 3032
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      416 CGAGCTCGGTGCTTTTGGACTTGACAGAAGCTATATGGGAAAATCCTTAAAAGACCTGGG 357

Qy     3033 AAAGAATTCTGAATATAATTCAAGTAACTGCTCCCTAAGCAGTTCTGAGAACCCATATGC 3092
            |||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||
Db      356 AAAGAATTCTGAATATAATTCAAGTAACTGCTCCCTAAGCAGTTTTGAGAACCCATATGC 297

Qy     3093 CACTATTAAAGACCCACCTGTACTTATCCCGAAAAGCTCAGAGTGTGGTTATGTGGAGAT 3152
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      296 CACTATTAAAGACCCACCTGTACTTATCCCGAAAAGCTCAGAGTGTGGTTATGTGGAGAT 237

Qy     3153 GAAATCGCCGGCACGAAGAGATTCCCCATATGCAGAGATCAATAACTCAACTTCAGCCAA 3212
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      236 GAAATCGCCGGCACGAAGAGATTCCCCATATGCAGAGATCAATAACTCAACTTCAGCCAA 177

Qy     3213 CAGGAATGTCTATGAAGTTG 3232
            ||||||||| ||||||||||
Db      176 CAGGAATGTTTATGAAGTTG 157
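
The header figures for RESULT 15 can be cross-checked against the alignment itself. The sketch below is an illustration, not GenCore's implementation. It assumes that Best Local Similarity is the number of matches divided by the number of aligned columns, and that under IDENTITY_NUC a match scores 1.0 and a mismatch costs 0.6. The 0.6 penalty is an inference from the ungapped records in this listing (435 matches and 5 mismatches scoring 432; 3422 matches and 1 mismatch scoring 3421.4), not a value taken from GenCore documentation. The function name `alignment_stats` is hypothetical.

```python
def alignment_stats(matches, mismatches, match_score=1.0, mismatch_penalty=0.6):
    """Return (aligned_columns, percent_identity, score) for an ungapped alignment.

    match_score and mismatch_penalty are assumptions inferred from this report.
    """
    columns = matches + mismatches
    identity = round(100.0 * matches / columns, 1)
    score = round(matches * match_score - mismatches * mismatch_penalty, 1)
    return columns, identity, score


# RESULT 15: Qy 2793..3232 aligned to Db 596..157 (complement strand, so Db runs downward)
columns, identity, score = alignment_stats(435, 5)
assert columns == 3232 - 2793 + 1 == 596 - 157 + 1
print(columns, identity, score)  # 440 98.9 432.0
```

Both coordinate ranges span 440 columns, which equals 435 matches plus 5 mismatches, so the header line and the alignment body agree.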



Search completed: March 30, 2004, 08:33:24
Job time : 8353 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on:         March 30, 2004, 01:07:41 ; Search time 13015 Seconds
                                           (without alignments)
                                           11399.401 Million cell updates/sec

Title:          US-10-092-390-1
Perfect score:  3423
Sequence:       1 atggttatttctttgaactc gcagcagcagcagtgaatga 3423

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       3470272 seqs, 21671516995 residues

Total number of hits satisfying chosen parameters:      6940544

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries



Database :      GenEmbl:*
                1:  gb_ba:*
                2:  gb_htg:*
                3:  gb_in:*
                4:  gb_om:*
                5:  gb_ov:*
                6:  gb_pat:*
                7:  gb_ph:*
                8:  gb_pl:*
                9:  gb_pr:*
                10:  gb_ro:*
                11:  gb_sts:*
                12:  gb_sy:*
                13:  gb_un:*
                14:  gb_vi:*
                15:  em_ba:*
                16:  em_fun:*
                17:  em_hum:*
                18:  em_in:*
                19:  em_mu:*
                20:  em_om:*
                21:  em_or:*
                22:  em_ov:*
                23:  em_pat:*
                24:  em_ph:*
                25:  em_pl:*
                26:  em_ro:*
                27:  em_sts:*
                28:  em_un:*
                29:  em_vi:*
                30:  em_htg_hum:*
                31:  em_htg_inv:*
                32:  em_htg_other:*
                33:  em_htg_mus:*
                34:  em_htg_pln:*
                35:  em_htg_rod:*
                36:  em_htg_mam:*
                37:  em_htg_vrt:*
                38:  em_sy:*
                39:  em_htgo_hum:*
                40:  em_htgo_mus:*
                41:  em_htgo_other:*



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
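
The "% Query Match" column in the summaries that follow appears to be each hit's score expressed as a percentage of the perfect score (3423 for this query). That relationship is inferred from the figures in this report rather than from a documented definition, and the function name below is hypothetical. A minimal sketch:

```python
PERFECT_SCORE = 3423  # "Perfect score" from the report header


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the perfect score, rounded to one decimal (assumed definition)."""
    return round(100.0 * score / perfect_score, 1)


# Scores taken from the summary rows of this report
for accession, score in [("AB058676", 3421.4), ("AK123568", 2274.8), ("BC020198", 1692.4)]:
    print(accession, query_match_percent(score))
```

The printed values can be compared against the Match column for the same accessions.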

SUMMARIES
                  %
Result          Query
  No.    Score  Match  Length  DB  ID        Description

    1   3421.4  100.0    7522   6  BD185216  BD185216 Novel gen
    2   3421.4  100.0    7522   9  AB058676  AB058676 Homo sapi
    3   2274.8   66.5    2952   9  AK123568  AK123568 Homo sapi
    4   1692.4   49.4    2267   9  BC020198  BC020198 Homo sapi
    5   1425.8   41.7    1448   6  AR217554  AR217554 Sequence
    6   1425.8   41.7    1448   6  BD080296  BD080296 Tango-71,
    7   1200.6   35.1    5702   9  AB058677  AB058677 Homo sapi
    8   1040.8   30.4    5278  10  AK122555  AK122555 Mus muscu
    9      742   21.7    3281   9  HSM805375 AL834326 Homo sapi
   10    649.4   19.0    4290  10  AF444274  AF444274 Mus muscu
   11    648.6   18.9    4539  10  AF461685  AF461685 Mus muscu
   12    646.2   18.9    4482  10  AF440279  AF440279 Mus muscu
   13    600.6   17.5    3574   6  AX492979  AX492979 Sequence
   14    548.2   16.0     660  10  RATORFD   L41686 Rattus norv
   15    534.8   15.6     632   6  AX079681  AX079681 Sequence
   16    515.4   15.1    3503  10  BC042490  BC042490 Mus muscu
   17    456.8   13.3    4470   9  AK074121  AK074121 Homo sapi
   18    400.8   11.7    5523  10  AB011532  AB011532 Rattus no
   19    396.8   11.6    7334   6  AX817303  AX817303 Sequence
   20      393   11.5    4835   6  AX817305  AX817305 Sequence
   21      387   11.3    5178   6  AX704753  AX704753 Sequence
   22    386.8   11.3    4733   6  AX817299  AX817299 Sequence
   23    383.8   11.2    7319   6  AX817297  AX817297 Sequence
   24    383.8   11.2    7337   6  AX817295  AX817295 Sequence
   25    365.4   10.7    3391  10  BC031402  BC031402 Mus muscu
   26    365.4   10.7    3402  10  BC039980  BC039980 Mus muscu
   27    362.8   10.6    2629  10  BC058571  BC058571 Mus muscu
   28    354.8   10.4    4501   9  AB011539  AB011539 Homo sapi
   29    250.8    7.3  146206   9  AC026800  AC026800 Homo sapi
   30    250.8    7.3  152765   2  AC008566  AC008566 Homo sapi
   31    250.8    7.3  175144   2  AC010415  AC010415 Homo sapi
   32    250.8    7.3  217221   9  AC008682  AC008682 Homo sapi
   33      243    7.1   48202   2  AC012911  AC012911 Drosophil
   34      243    7.1  172175   3  AC010038  AC010038 Drosophil
c  35      243    7.1  177583   3  AC105264  AC105264 Drosophil
c  36      243    7.1  257867   3  AC005557  AC005557 Drosophil
c  37      243    7.1  303191   3  AE003472  AE003472 Drosophil
   38    241.8    7.1  192282   9  AC010424  AC010424 Homo sapi
   39    239.8    7.0    2054   9  AK098809  AK098809 Homo sapi
   40      216    6.3    3764   3  AF332568  AF332568 Caenorhab
   41    197.2    5.8  185768  10  AC102173  AC102173 Mus muscu
c  42    192.4    5.6  194593   2  AC139989  AC139989 Rattus no
   43    192.4    5.6  226956  10  AC095558  AC095556 Rattus no
   44    192.4    5.6  231950   2  AC094759  AC094759 Rattus no
c  45      180    5.3  209476   2  AC122117  AC122117 Mus muscu

ALIGNMENTS 



RESULT 1
BD185216
LOCUS       BD185216                7522 bp    DNA     linear   PAT 17-JUN-2003
DEFINITION  Novel genes and proteins encoded by the genes.
ACCESSION   BD185216
VERSION     BD185216.1  GI:31877416
KEYWORDS    JP 2002345493-A/59.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 7522)
  AUTHORS   Ohara,O., Nagase,T. and Nakajima,D.
  TITLE     Novel genes and proteins encoded by the genes
  JOURNAL   Patent: JP 2002345493-A 59 03-DEC-2002;
            KAZUSA DNA RESEARCH INSTITUTE
COMMENT     OS   Homo sapiens (human)
            PN   JP 2002345493-A/59
            PD   03-DEC-2002
            PF   26-FEB-2002 JP 2002049046
            PI   OSAMU OHARA, TAKAHIRO NAGASE, DAISUKE NAKAJIMA
            PC   C12N15/09,C07K14/47,C07K14/54,C12N15/00
            CC   Novel genes and proteins encoded by the genes
            FH   Key             Location/Qualifiers
            FT   CDS             (48)..(3623).
FEATURES             Location/Qualifiers
     source          1..7522
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
ORIGIN

Qy        1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      204 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 263

Qy       61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      264 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 323

Qy      121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      324 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 383

Qy      181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      384 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 443

Qy      241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      444 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 503

Qy      301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      504 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 563

Qy      361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      564 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 623

Qy      421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      624 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 683

Qy      481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      684 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 743

Qy      541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      744 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 803

Qy      601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      804 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 863

Qy      661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      864 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 923

Qy      721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      924 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 983

Qy      781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      984 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 1043

Qy      841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1044 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 1103

Qy      901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1104 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 1163

Qy      961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1164 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1223

Qy     1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1224 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1283

Qy     1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1284 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1343

Qy     1141 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1200
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1344 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1403

Qy     1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1404 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1463

Qy     1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1464 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1523

Qy     1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1524 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1583

Qy     1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1584 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1643

Qy     1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1644 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1703

Qy     1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1704 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1763

Qy     1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1764 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1823

Qy     1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1824 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1883

Qy     1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1740
            || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1884 CCCGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1943

Qy     1741 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 1800
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1944 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 2003

Qy     1801 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 1860
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2004 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 2063

Qy     1861 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCAC 1920
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2064 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCAC 2123

Qy     1921 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 1980
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2124 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 2183

Qy     1981 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2040
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2184 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2243

Qy     2041 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2100
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2244 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2303

Qy     2101 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2160
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2304 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2363

Qy     2161 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2220
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2364 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2423

Qy     2221 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAA 2280
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2424 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAA 2483

Qy     2281 TGTCAAAACGGAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTC 2340
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2484 TGTCAAAACGGAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTC 2543

Qy     2341 ATGGGACGGCACTGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAG 2400
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2544 ATGGGACGGCACTGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAG 2603

Qy     2401 ATATGTGATTGTCTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGC 2460
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2604 ATATGTGATTGTCTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGC 2663

Qy     2461 CCCGGATGGAAGGGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAAC 2520
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2664 CCCGGATGGAAGGGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAAC 2723

Qy     2521 AGCTTAAGCCGAACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCA 2580
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2724 AGCTTAAGCCGAACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCA 2783

Qy     2581 GGCATCATCATTCTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGA 2640
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2784 GGCATCATCATTCTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGA 2843



Qy     2641 CACAAGCAGAAGGGAAAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGG 2700
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2844 CACAAGCAGAAGGGAAAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGG 2903

Qy     2701 GTCGTCAATGCAGATTATACCATTTCAGGAACCCTTCCTCACAGCAATGGTGGAAACGCT 2760
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2904 GTCGTCAATGCAGATTATACCATTTCAGGAACCCTTCCTCACAGCAATGGTGGAAACGCT 2963

Qy     2761 AATAGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCT 2820
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     2964 AATAGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCT 3023

Qy     2821 CACGTCAACAACAGGGACAGGATGACTGTCACGAAGTCAAAAAACAATCAACTGTTTGTG 2880
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     3024 CACGTCAACAACAGGGACAGGATGACTGTCACGAAGTCAAAAAACAATCAACTGTTTGTG 3083

Qy     2881 AATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTG 2940
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     3084 AATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTG 3143

Qy     2941 CCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGA 3000
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     3144 CCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGA 3203

Qy     3001 AGCTATATGGGAAAATCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAAC 3060
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     3204 AGCTATATGGGAAAATCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAAC 3263

Qy     3061 TGCTCCCTAAGCAGTTCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATC 3120
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     3264 TGCTCCCTAAGCAGTTCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATC 3323

Qy     3121 CCGAAAAGCTCAGAGTGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCA 3180
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     3324 CCGAAAAGCTCAGAGTGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCA 3383

Qy     3181 TATGCAGAGATCAATAACTCAACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACA 3240
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     3384 TATGCAGAGATCAATAACTCAACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACA 3443

Qy     3241 GTGAGTGTTGTCCAAGGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGAC 3300
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     3444 GTGAGTGTTGTCCAAGGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGAC 3503

Qy     3301 CTCCCAAAGAACAGTCACATCCCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCA 3360
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     3504 CTCCCAAAGAACAGTCACATCCCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCA 3563

Qy     3361 TCCTCCCCTAAGCAAGAGGACAGTGGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAA 3420
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     3564 TCCTCCCCTAAGCAAGAGGACAGTGGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAA 3623

Qy     3421 TGA 3423
            |||
Db     3624 TGA 3626



RESULT 2 
AB058676 

LOCUS AB058676 7522 bp mRNA linear PRI 10-MAY-2001 

DEFINITION Homo sapiens mRNA for MEGF10 protein (KIAA1780) , complete cds . 
ACCESSION AB058676 

VERSION AB058676.1 GI:14017776 

KEYWORDS 

SOURCE Homo sapiens (human) 

ORGANISM Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata.; Vertebrata; Euteleostomi ; 
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
REFERENCE 1 (sites) 

AUTHORS Nagase,T., Nakayama,M., Nakajima,D., Kikuno,R. and Ohara,0. 

TITLE Prediction of the coding sequences of unidentified human genes. XX. 

The complete sequences of 100 new cDNA clones from brain which code 
for large proteins in vitro 
JOURNAL DNA Res. 8 (2), 85-95 (2001) 
MEDLINE 21245130 
PUBMED 11347906 
REFERENCE 2 (bases 1 to 7522) 

AUTHORS Nakayama,M., Nagase,T., Naka j irna, D . , Kikuno,R. and Ohara,0. 
TITLE Direct Submission 

JOURNAL Submitted ( 27-MAR-2 001 ) Manabu Nakayama, Kazusa DNA Rsearch 
Institute, Department of Human Gene Research; 1532-3, Yana, 
Kisarazu, Chiba 292-0812, Japan (E-mail : nmanabu@kazusa . or . jp, 
URL :http: //www. kazusa. or. jp/huge, Tel : 81-4 38-52-3915, 
Fax:81-438-52-3914) 
FEATURES Location/Qualifiers 
source 1..7522
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/clone="pf01012"
/tissue_type="hippocampus"
/dev_stage="adult"
/note="vector: pBluescript II SK plus"
gene 1..7522
/gene="MEGF10"
CDS 204..3626
/gene="MEGF10"
/note="KIAA1780 protein
gene encoding protein with multiple EGF-like-domains"
/codon_start=1
/product="MEGF10 protein (KIAA1780)"
/protein_id="BAB47409.1"
/db_xref="GI:14017777"
/translation="MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTV
QESYPHPFDQIYYTSCTDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESG 
EMCVPHCADKCVHGRCIAPNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALC 
NPITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTG 
AFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKN 
CSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQCVNGGKCYH 
VSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACKPGWS 
GLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGI 
NCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACN 
TLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGV 



HCDSVCAEGRWGPNCSLPCYCKNGASCSPDDGICECAPGFRGTTCQRICSPGFYGHRC 
SQTCPQCVHSSGPCHHITGLCDCLPGFTGALCNEVCPSGRFGKNCAGICTCTNNGTCN 
PIDRSCQCYPGWIGSDCSQPCPPAHWGPNCIHTCNCHNGAFCSAYDGECKCTPGWTGL 
YCTQRCPLGFYGKDCALICQCQNGADCDHISGQCTCRTGFMGRHCEQKCPSGTYGYGC 
RQICDCLNNSTCDHITGTCYCSPGWKGARCDQAGVIIVGNLNSLSRTSTALPADSYQI
GAIAGIIILVLVVLFLLALFIIYRHKQKGKESSMPAVTYTPAMRVVNADYTISGTLPH
SNGGNANSHYFTNPSYHTLTQCATSPHVNNRDRMTVTKSKNNQLFVNLKNVNPGKRGP 
VGDCTGTLPADWKHGGYLNELGAFGLDRSYMGKSLKDLGKNSEYNSSNCSLSSSENPY 
ATIKDPPVLIPKSSECGYVEMKSPARRDSPYAEINNSTSANRNVYEVEPTVSVVQGVF
SNNGRLSQDPYDLPKNSHIPCHYDLLPVRDSSSSPKQEDSGGSSSNSSSSSE" 

ORIGIN 



Query Match 100.0%; Score 3421.4; DB 9; Length 7522; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 3422; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy      1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    204 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 263

Qy     61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    264 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 323

Qy    121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    324 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 383

Qy    181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    384 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 443

Qy    241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    444 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 503

Qy    301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    504 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 563

Qy    361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    564 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 623

Qy    421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    624 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 683

Qy    481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    684 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 743

Qy    541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    744 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 803

Qy    601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    804 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 863

Qy    661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    864 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 923

Qy    721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    924 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 983

Qy    781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    984 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 1043

Qy    841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1044 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 1103

Qy    901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1104 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 1163

Qy    961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1164 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1223

Qy   1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1224 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1283

Qy   1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1284 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1343

Qy   1141 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1344 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1403

Qy   1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1404 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1463

Qy   1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1464 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1523

Qy   1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1524 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1583

Qy   1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1584 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1643

Qy   1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1644 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1703

Qy   1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1704 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1763

Qy   1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1764 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1823

Qy   1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1824 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1883

Qy   1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1740
          || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1884 CCCGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1943

Qy   1741 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 1800
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1944 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 2003

Qy   1801 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 1860
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2004 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 2063

Qy   1861 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCAC 1920
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2064 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCAC 2123

Qy   1921 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 1980
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2124 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 2183

Qy   1981 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2040
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2184 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2243

Qy   2041 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2100
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2244 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2303

Qy   2101 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2160
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2304 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2363

Qy   2161 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2220
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2364 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2423

Qy   2221 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAA 2280
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2424 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAA 2483

Qy   2281 TGTCAAAACGGAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTC 2340
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2484 TGTCAAAACGGAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTC 2543



Qy   2341 ATGGGACGGCACTGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAG 2400
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2544 ATGGGACGGCACTGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAG 2603

Qy   2401 ATATGTGATTGTCTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGC 2460
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2604 ATATGTGATTGTCTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGC 2663

Qy   2461 CCCGGATGGAAGGGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAAC 2520
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2664 CCCGGATGGAAGGGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAAC 2723

Qy   2521 AGCTTAAGCCGAACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCA 2580
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2724 AGCTTAAGCCGAACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCA 2783

Qy   2581 GGCATCATCATTCTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGA 2640
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2784 GGCATCATCATTCTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGA 2843

Qy   2641 CACAAGCAGAAGGGAAAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGG 2700
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2844 CACAAGCAGAAGGGAAAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGG 2903

Qy   2701 GTCGTCAATGCAGATTATACCATTTCAGGAACCCTTCCTCACAGCAATGGTGGAAACGCT 2760
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2904 GTCGTCAATGCAGATTATACCATTTCAGGAACCCTTCCTCACAGCAATGGTGGAAACGCT 2963

Qy   2761 AATAGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCT 2820
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2964 AATAGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCT 3023

Qy   2821 CACGTCAACAACAGGGACAGGATGACTGTCACGAAGTCAAAAAACAATCAACTGTTTGTG 2880
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3024 CACGTCAACAACAGGGACAGGATGACTGTCACGAAGTCAAAAAACAATCAACTGTTTGTG 3083

Qy   2881 AATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTG 2940
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3084 AATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTG 3143

Qy   2941 CCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGA 3000
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3144 CCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGA 3203

Qy   3001 AGCTATATGGGAAAATCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAAC 3060
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3204 AGCTATATGGGAAAATCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAAC 3263

Qy   3061 TGCTCCCTAAGCAGTTCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATC 3120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3264 TGCTCCCTAAGCAGTTCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATC 3323

Qy   3121 CCGAAAAGCTCAGAGTGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCA 3180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3324 CCGAAAAGCTCAGAGTGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCA 3383



Qy   3181 TATGCAGAGATCAATAACTCAACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACA 3240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3384 TATGCAGAGATCAATAACTCAACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACA 3443

Qy   3241 GTGAGTGTTGTCCAAGGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGAC 3300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3444 GTGAGTGTTGTCCAAGGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGAC 3503

Qy   3301 CTCCCAAAGAACAGTCACATCCCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCA 3360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3504 CTCCCAAAGAACAGTCACATCCCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCA 3563

Qy   3361 TCCTCCCCTAAGCAAGAGGACAGTGGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAA 3420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3564 TCCTCCCCTAAGCAAGAGGACAGTGGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAA 3623

Qy   3421 TGA 3423
          |||
Db   3624 TGA 3626
LOCUS AK123568 2952 bp mRNA linear PRI 09-SEP-2003

DEFINITION Homo sapiens cDNA FLJ41574 fis, clone CTONG2010116, weakly similar
to Rattus norvegicus mRNA for MEGF6.
ACCESSION AK123568

VERSION AK123568.1 GI:34529147

KEYWORDS oligo capping; fis (full insert sequence).
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1

AUTHORS Tanigami,A., Fujiwara,T., Shibahara,T., Goto,Y., Hirao,M.,
Shimizu,F., Wakebe,H., Ono,T., Hishigaki,H., Watanabe,T., Ozaki,K.,
Sugiyama,T., Irie,R., Otsuki,T., Sato,H., Wakamatsu,A., Ishii,S.,
Yamamoto,J., Isono,Y., Kawai-Hio,Y., Saito,K., Nishikawa,T.,
Kimura,K., Yamashita,H., Matsuo,K., Nakamura,Y., Sekine,M.,
Kikuchi,H., Kanda,K., Wagatsuma,M., Murakawa,K., Kanehori,K.,
Takahashi-Fujii,A., Oshima,A., Sugiyama,A., Kawakami,B., Suzuki,Y.,
Sugano,S., Nagahari,K., Masuho,Y., Nagai,K. and Isogai,T.
TITLE NEDO human cDNA sequencing project
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 2952)

AUTHORS Isogai,T. and Yamamoto,J.
TITLE Direct Submission

JOURNAL Submitted (15-JUL-2003) Takao Isogai, FLJ Project (HRI Team); 2-6-7
Kazusa-Kamatari, Kisarazu, Chiba 292-0818, Japan
(E-mail:genomics@hri.co.jp, Tel:81-438-52-3975, Fax:81-438-52-3986)
COMMENT NEDO human cDNA sequencing project supported by Ministry of
Economy, Trade and Industry of Japan; cDNA full insert sequencing:
Research Association for Biotechnology (RAB); cDNA library
construction: Helix Research Institute (HRI) (supported by Japan
Key Technology Center etc.); 5'- & 3'-end one pass sequencing: RAB,
HRI, and Biotechnology Center, National Institute of Technology and
Evaluation; clone selection for full insert sequencing: HRI and
RAB; annotation: HRI and RAB.
FEATURES Location/Qualifiers 
source 1..2952

/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/clone="CTONG2010116"
/tissue_type="tongue, tumor tissue"
/clone_lib="CTONG2"
/note="cloning vector: pME18SFL3"

ORIGIN 

Query Match 66.5%; Score 2274.8; DB 9; Length 2952; 

Best Local Similarity 99.9%; Pred. No. 0; 

Matches 2276; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy      1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    478 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 537

Qy     61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    538 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 597

Qy    121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    598 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 657

Qy    181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    658 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 717

Qy    241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    718 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 777

Qy    301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    778 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 837

Qy    361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    838 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 897

Qy    421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    898 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 957

Qy    481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    958 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 1017

Qy    541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1018 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 1077



Qy    601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1078 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 1137

Qy    661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1138 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 1197

Qy    721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1198 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 1257

Qy    781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1258 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 1317

Qy    841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1318 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 1377

Qy    901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1378 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 1437

Qy    961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1438 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1497

Qy   1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1498 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1557

Qy   1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1558 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1617

Qy   1141 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1618 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1677

Qy   1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1678 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1737

Qy   1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1738 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1797

Qy   1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1798 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1857

Qy   1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1858 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1917

Qy   1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1918 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1977

Qy   1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1978 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 2037

Qy   1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2038 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 2097

Qy   1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2098 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 2157

Qy   1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1740
          || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2158 CCCGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 2217

Qy   1741 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 1800
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2218 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 2277

Qy   1801 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 1860
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2278 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 2337

Qy   1861 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCAC 1920
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2338 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCAC 2397

Qy   1921 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 1980
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2398 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 2457

Qy   1981 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2040
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2458 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2517

Qy   2041 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2100
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2518 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2577

Qy   2101 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2160
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2578 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2637

Qy   2161 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2220
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2638 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2697

Qy   2221 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCC 2278
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db   2698 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGAC 2755
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BC020198 2267 bp mRNA linear PRI 16-SEP-2003 

Homo sapiens MEGF10 protein, mRNA (cDNA clone IMAGE: 4 904255) , 
complete eels . 
BC020198 

BC020198.1 GI: 18044 365 

Homo sapiens (human) 
Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

1 (bases 1 to 2267) 

Strausberg,R.L., Feingold,E.A., Grouse,L.H., Derge,J.G., 
Klausner,R.D., Collins,F.S., Wagner,L., Shenmen,C.M., Schuler,G.D., 
Altschul,S.F., Zeeberg,B., Buetow,K.H., Schaefer,C.F., Bhat,N.K., 
Hopkins,R.F., Jordan,H., Moore,T., Max,S.I., Wang,J., Hsieh,F., 
Diatchenko,L., Marusina,K., Farmer,A.A., Rubin,G.M., Hong,L., 
Stapleton,M., Soares,M.B., Bonaldo,M.F., Casavant,T.L., 
Scheetz,T.E., Brownstein,M.J., Usdin,T.B., Toshiyuki,S., 
Carninci,P., Prange,C., Raha,S.S., Loquellano,N.A., Peters,G.J., 
Abramson,R.D., Mullahy,S.J., Bosak,S.A., McEwan,P.J., 
McKernan,K.J., Malek,J.A., Gunaratne,P.H., Richards,S., 
Worley,K.C., Hale,S., Garcia,A.M., Gay,L.J., Hulyk,S.W., 
Villalon,D.K., Muzny,D.M., Sodergren,E.J., Lu,X., Gibbs,R.A., 
Fahey,J., Helton,E., Ketteman,M., Madan,A., Rodrigues,S., 
Sanchez,A., Whiting,M., Madan,A., Young,A.C., Shevchenko,Y., 
Bouffard,G.G., Blakesley,R.W., Touchman,J.W., Green,E.D., 
Dickson,M.C., Rodriguez,A.C., Grimwood,J., Schmutz,J., Myers,R.M., 
Butterfield,Y.S., Krzywinski,M.I., Skalska,U., Smailus,D.E., 
Schnerch,A., Schein,J.E., Jones,S.J. and Marra,M.A. 
Generation and initial analysis of more than 15,000 full-length 
human and mouse cDNA sequences 

Proc. Natl. Acad. Sci. U.S.A. 99 (26), 16899-16903 (2002) 

22388257 

12477932 

2 (bases 1 to 2267) 
Strausberg, R. 
Direct Submission 

Submitted (19-DEC-2001) National Institutes of Health, Mammalian 
Gene Collection (MGC), Cancer Genomics Office, National Cancer 
Institute, 31 Center Drive, Room 11A03, Bethesda, MD 20892-2590, 
USA 

NIH-MGC Project URL: http://mgc.nci.nih.gov 
Contact: MGC help desk 
Email: cgapbs-r@mail.nih.gov 
Tissue Procurement: ATCC 

cDNA Library Preparation: Rubin Laboratory 

cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL) 

DNA Sequencing by: Genome Sequence Centre, 

BC Cancer Agency, Vancouver, BC, Canada 

info@bcgsc.bc.ca 

Steven Jones, Jennifer Asano, Ian Bosdet, Yaron Butterfield, 
Susanna Chan, Readman Chiu, Chris Fjell, Erin Garland, Ran Guin, 
Letticia Hsiao, Martin Krzywinski, Reta Kutsche, Oliver Lee, Soo 
Sen Lee, Victor Ling, Carrie Mathewson, Candice McLeavy, Steven 
Ness, Pawan Pandoh, Anna-Liisa Prabhu, Parvaneh Saeedi, Jacqueline 
Schein, Duane Smailus, Michael Smith, Lorraine Spence, Jeff Stott, 
Michael Thorne, Miranada Tsai, Natasja van den Bosch, Jill Vardy, 



George Yang, Scott Zuyderduyn, Marco Marra. 



Clone distribution: MGC clone distribution information can be found 
through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov 
Series: IRAL Plate: 40 Row: o Column: 4 

This clone was selected for full length sequencing because it 
passed the following selection criteria: matched mRNA gi: 14192942 
This clone has the following problem: The cds is short compared to 
the longest cds in the locus. 

FEATURES Location/Qualifiers 
source 1..2267 
/organism="Homo sapiens" 
/mol_type="mRNA" 
/db_xref="taxon:9606" 
/clone="IMAGE:4904255" 
/tissue_type="Muscle, rhabdomyosarcoma" 
/clone_lib="NIH_MGC_17" 
/lab_host="DH10B-R" 
/note="Vector: pOTB7" 
gene 1..2267 
/gene="MEGF10" 
/note="synonym: KIAA1780" 
/db_xref="LocusID:84466" 
CDS 280..1983 
/codon_start=1 
/product="MEGF10 protein" 
/protein_id="AAH20198.1" 
/db_xref="GI:18044366" 
/db_xref="LocusID:84466" 

/translation="MVISLNSCLSFICLLLCHWIGTASPLNLEDPNVCSHWESYSVTV 
QESYPHPFDQIYYTSCTDILNWFKCTRHRVSYRTAYRHGEKTMYRRKSQCCPGFYESG 
EMCVPHCADKCVHGRCIAPNTCQCEPGWGGTNCSSACDGDHWGPHCTSRCQCKNGALC 
NPITGACHCAAGFRGWRCEDRCEQGTYGNDCHQRCQCQNGATCDHVTGECRCPPGYTG 
AFCEDLCPPGKHGPQCEQRCPCQNGGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKN 
CSQECQCHNGGTCDAATGQCHCSPGYTGERCQDECPVGTYGVLCAETCQCVNGGKCYH 
VSGACLCEAGFAGERCEARLCPEGLYGIKCDKRCPCHLENTHSCHPMSGECACKPGWS 
GLYCNETCSPGFYGEACQQICSCQNGADCDSVTGKCTCAPGFKGIDCSTPCPLGTYGI 
NCSSRCGCKNDAVCSPVDGSCTCKAGWHGVDCSIRCPSGTWGFGCNLTCQCLNGGACN 
TLDGTCTCAPGWRGEKCELPCQDGTYGLNCAERCDCSHADGCHPTTGHCRCLPGWSGL 
F" 



ORIGIN 



Query Match 49.4%; Score 1692.4; DB 9; Length 2267; 
Best Local Similarity 99.9%; Pred. No. 0; 
Matches 1693; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy      1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    280 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 339 



Qy     61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    340 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 399 

Qy    121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    400 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 459 

Qy    181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    460 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 519 

Qy    241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    520 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 579 

Qy    301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    580 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 639 

Qy    361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    640 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 699 

Qy    421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    700 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 759 

Qy    481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    760 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 819 

Qy    541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    820 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 879 

Qy    601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    880 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 939 

Qy    661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    940 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 999 

Qy    721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1000 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 1059 

Qy    781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1060 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 1119 

Qy    841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1120 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 1179 

Qy    901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1180 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 1239 

Qy    961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1240 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1299 



Qy   1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1300 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1359 

Qy   1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1360 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1419 

Qy   1141 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1200 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1420 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1479 

Qy   1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1480 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1539 

Qy   1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1540 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1599 

Qy   1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1600 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1659 

Qy   1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1660 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1719 

Qy   1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1720 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1779 

Qy   1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1780 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1839 

Qy   1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1840 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1899 

Qy   1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1900 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1959 

Qy   1681 CCGGGATGGTCAGG 1694 
          || |||||||||||
Db   1960 CCCGGATGGTCAGG 1973 



RESULT 5 
AR217554 
LOCUS AR217554 1448 bp DNA linear PAT 25-SEP-2002 
DEFINITION Sequence 9 from patent US 6416974. 
ACCESSION AR217554 
VERSION AR217554.1 GI:23317353 
KEYWORDS 
SOURCE Unknown. 



ORGANISM Unknown. 
Unclassified. 
REFERENCE 1 (bases 1 to 1448) 
AUTHORS Holtzman,D.A. and Goodearl,A.D.J. 
TITLE Tango 71 nucleic acids 
JOURNAL Patent: US 6416974-A 9 09-JUL-2002; 
FEATURES Location/Qualifiers 
source 1..1448 
/organism="unknown" 
/mol_type="genomic DNA" 

ORIGIN 

Query Match 41.7%; Score 1425.8; DB 6; Length 1448; 

Best Local Similarity 99.9%; Pred. No. 0; 

Matches 1427; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 
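
The summary figures above appear to be related by simple ratios. The sketch below is an inference from the numbers printed in this report, not a documented definition from the search program: it assumes that Query Match is the hit score divided by the query's perfect score (3423, from the run header), and that Best Local Similarity is the number of matches divided by the number of aligned columns (matches + mismatches + indels). The function names are hypothetical.

```python
# Hypothetical helpers; the formulas are inferred from this report's own
# numbers and are assumptions, not the search program's documented definitions.

PERFECT_SCORE = 3423  # "Perfect score" printed in the run header


def query_match_pct(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Hit score as a percentage of the query's perfect (self-match) score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all aligned columns."""
    return 100.0 * matches / (matches + mismatches + indels)


if __name__ == "__main__":
    # Values taken from the AR217554 summary lines.
    print(round(query_match_pct(1425.8), 1))
    print(round(best_local_similarity_pct(1427, 2, 0), 1))
```

Under these assumptions the same two ratios also reproduce the percentages printed for the BC020198 and AB058677 hits in this listing.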

Qy   1216 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 1275 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     18 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 77 

Qy   1276 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 1335 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     78 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 137 

Qy   1336 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 1395 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    138 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 197 

Qy   1396 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 1455 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    198 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 257 

Qy   1456 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 1515 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    258 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 317 

Qy   1516 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 1575 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    318 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 377 

Qy   1576 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 1635 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    378 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 437 

Qy   1636 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGT 1695 
          ||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db    438 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCCGGATGGTCAGGT 497 

Qy   1696 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 1755 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    498 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 557 

Qy   1756 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 1815 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    558 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 617 



Qy   1816 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 1875 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    618 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 677 

Qy   1876 CAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGT 1935 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    678 CAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGT 737 

Qy   1936 GACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTT 1995 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    738 GACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTT 797 

Qy   1996 GGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGAC 2055 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    798 GGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGAC 857 

Qy   2056 AGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCT 2115 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    858 AGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCT 917 

Qy   2116 GCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGC 2175 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    918 GCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGC 977 

Qy   2176 GCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGA 2235 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    978 GCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGA 1037 

Qy   2236 TGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCT 2295 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1038 TGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCT 1097 

Qy   2296 GACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGT 2355 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1098 GACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGT 1157 

Qy   2356 GAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTG 2415 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1158 GAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTG 1217 

Qy   2416 AACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGA 2475 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1218 AACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGA 1277 

Qy   2476 GCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACC 2535 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1278 GCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACC 1337 

Qy   2536 AGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTT 2595 
          ||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||
Db   1338 AGTACTGCTCTCCCTGCTGATTCCTACCAAATCGGGGCCATTGCAGGCATCATCATTCTT 1397 

Qy   2596 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACA 2644 
          |||||||||||||||||||||||||||||||||||||||||||||||||
Db   1398 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACA 1446 



BD080296 

LOCUS BD080296 1448 bp DNA linear PAT 27-AUG-2002 

DEFINITION Tango-71, Tango-73, Tango-74, Tango-76, and Tango-83 nucleic acid 

molecules and polypeptides. 
ACCESSION BD080296 

VERSION BD080296.1 GI:22625899 

KEYWORDS JP 2001512681-A/5. 
SOURCE Homo sapiens (human) 

ORGANISM Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
REFERENCE 1 (bases 1 to 1448) 

AUTHORS Holtzman,D.A. and Goodearl,A.D.J. 

TITLE Tango-71, Tango-73, Tango-74, Tango-76, and Tango-83 nucleic acid 

molecules and polypeptides 
JOURNAL Patent: JP 2001512681-A 5 28-AUG-2001; 
MILLENNIUM PHARMACEUTICALS INC 
COMMENT OS Homo sapiens (human) 
PN JP 2001512681-A/5 
PD 28-AUG-2001 
PF 06-AUG-1998 JP 2000506335 
PR 06-AUG-1997 US 60/054966, 05-SEP-1997 US 60/058108 
PI DOUGLAS A HOLTZMAN, ANDREW D J GOODEARL 
PC C12N15/09,C07K14/47,C07K16/18,C12N5/10,C12P21/02,C12Q1/68, 
PC G01N33/15, 
PC G01N33/50,G01N33/53//C12P21/08,C12N15/00,C12N5/00 
CC Tango-71, Tango-73, Tango-74, Tango-76, and Tango-83 nucleic 
CC acid 
CC molecules and polypeptides 
FH Key Location/Qualifiers 
FT source 1..1448 
FT /organism='Homo sapiens (human)'. 
FEATURES Location/Qualifiers 
source 1..1448 
/organism="Homo sapiens" 
/mol_type="genomic DNA" 
/db_xref="taxon:9606" 

ORIGIN 



Query Match 41.7%; Score 1425.8; DB 6; Length 1448; 
Best Local Similarity 99.9%; Pred. No. 0; 
Matches 1427; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 



Qy 1216 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 1275 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 18 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 77 



Qy 1276 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 1335 

I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 78 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 137 

Qy 1336 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 1395 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 138 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 197 



Qy 



1396 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 1455 
M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 



Db 198 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 257 

Qy 1456 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 1515 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 1 I I I I M I 

Db 258 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 317 

Qy 1516 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 1575 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 318 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 377 

Qy 1576 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 1635 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 378 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 437 

Qy 1636 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGT 1695 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I 

Db 438 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCCGGATGGTCAGGT 497 

Qy 1696 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 1755 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I 

Db 498 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 557 

Qy 1756 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 1815 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 

Db 558 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 617 

Qy 1816 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 1875 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 618 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 677 

Qy 1876 CAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGT 1935 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I II I I 
Db 678 CAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGT 737 

Qy 1936 GACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTT 1995 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 738 GACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTT 797 

Qy 1996 GGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGAC 2055 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 1 I I I I I I I I I I I I I I I I I i I I I I I I I I I I I 

Db 798 GGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGAC 857 

Qy 2056 AGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCT 2115 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 858 AGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCT 917 

Qy 2116 GCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGC 2175 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 918 GCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGC 977 

Qy 2176 GCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGA 2235 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 978 GCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGA 1037 



Qy   2236 TGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCT 2295 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1038 TGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCT 1097 

Qy   2296 GACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGT 2355 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1098 GACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGT 1157 

Qy   2356 GAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTG 2415 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1158 GAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTG 1217 

Qy   2416 AACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGA 2475 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1218 AACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGA 1277 

Qy   2476 GCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACC 2535 
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1278 GCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACC 1337 

Qy   2536 AGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTT 2595 
          ||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||
Db   1338 AGTACTGCTCTCCCTGCTGATTCCTACCAAATCGGGGCCATTGCAGGCATCATCATTCTT 1397 

Qy   2596 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACA 2644 
          |||||||||||||||||||||||||||||||||||||||||||||||||
Db   1398 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACA 1446 





RESULT 7 
LOCUS AB058677 5702 bp mRNA linear PRI 10-MAY-2001 
DEFINITION Homo sapiens mRNA for MEGF11 protein (KIAA1781), complete cds. 
ACCESSION AB058677 
VERSION AB058677.1 GI:14017778 
KEYWORDS 
SOURCE Homo sapiens (human) 
ORGANISM Homo sapiens 
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
REFERENCE 1 (sites) 
AUTHORS Nagase,T., Nakayama,M., Nakajima,D., Kikuno,R. and Ohara,O. 
TITLE Prediction of the coding sequences of unidentified human genes. XX. 
The complete sequences of 100 new cDNA clones from brain which code 
for large proteins in vitro 
JOURNAL DNA Res. 8 (2), 85-95 (2001) 
MEDLINE 21245130 
PUBMED 11347906 
REFERENCE 2 (bases 1 to 5702) 
AUTHORS Nakayama,M., Nagase,T., Nakajima,D., Kikuno,R. and Ohara,O. 
TITLE Direct Submission 
JOURNAL Submitted (27-MAR-2001) Manabu Nakayama, Kazusa DNA Research 
Institute, Department of Human Gene Research; 1532-3, Yana, 
Kisarazu, Chiba 292-0812, Japan (E-mail:nmanabu@kazusa.or.jp, 
URL:http://www.kazusa.or.jp/huge, Tel:81-438-52-3915, 
Fax:81-438-52-3914) 
FEATURES Location/Qualifiers 
source 1..5702 
/organism="Homo sapiens" 
/mol_type="mRNA" 
/db_xref="taxon:9606" 
/clone="fg06971" 
/tissue_type="brain" 
/dev_stage="fetal" 
/note="vector rpBluescript II SK plus" 
gene 1..5702 
/gene="MEGF11" 
CDS 160..3069 
/gene="MEGF11" 
/note="KIAA1781 protein 
Start codon is not confirmed. fg06971 cDNA clone for 
KIAA1781 has a 119bp insertion after the position 2642 of 
the sequence of KIAA1781 
gene encoding protein with multiple EGF-like-domains" 
/codon_start=1 
/product="MEGF11 protein (KIAA1781)" 
/protein_id="BAB47410.1" 
/db_xref="GI:14017779" 

/translation="MHTPSIRSITHDAQTSSTGSSAPGTALCTEECVHGRCVSPDTCH 
CEPGWGGPDCSSGCDSDHWGPHCSNRCQCQNGALCNPITGACVCAAGFRGWRCEELCA 
PGTHGKGCQLPCQCRHGASCDPRAGECLCAPGYTGVYCEELCPPGSHGAHCELRCPCQ 
NGGTCHHITGECACPPGWTGAVCAQPCPPGTFGQNCSQDCPCHHGGQCDHVTGQCHCT 
AGYMGDRCQEECPFGSFGFQCSQRCDCHNGGQCSPTTGACECEPGYKGPRCQERLCPE 
GLHGPGCTLPCPCDADNTISCHPVTGACTCQPGWSGHHCNESCPVGYYGDGCQLPCTC 
QNGADCHSITGGCTCAPGFMGEVCAVSCAAGTYGPNCSSICSCNNGGTCSPVDGSCTC 
KEGWQGLDCTLPCPSGTWGLNCNESCTCANGAACSPIDGSCSCTPGWLGDTCELPCPD 
GTFGLNCSEHCDCSHADGCDPVTGHCCCLAGWTGIRCDSTCPPGRWGPNCSVSCSCEN 
GGSCSPEDGSCECAPGFRGPLCQRICPPGFYGHGCAQPCPLCVHSSRPCHHISGICEC 
LPGFSGALCNQVCAGGYFGQDCAQLCSCANNGTCSPIDGSCQCFPGWIGKDCSQACPP 
GFWGPACFHACSCHNGASCSAEDGACHCTPGWTGLFCTQRCPAAFFGKDCGRVCQCQN 
GASCDHISGKCTCRTGFTGQHCEQRCAPGTFGYGCQQLCECMNNSTCDHVTGTCYCSP 
GFKGIRCDQAALMMEELNPYTKISPALGAERHSVGAVTGIMLLLFFIWLLGLFAWHR 
RRQKEKGRDLAPRVSYTPAMRMTSTDYSLSGACGMDRRQNTYIMDKGFKDYMKESVCS 
SSTCSLNSSENPYATIKDPPILTCKLPESSYVEMKSPVHMGSPYTDVPSLSTSNKNIY 
EVEPTVSWQEGCGHNSSYIQNAYDLPRNSHIPGHYDLLPVRQSPANGPSQDKQS" 

ORIGIN 

Query Match 35.1%; Score 1200.6; DB 9; Length 5702; 

Best Local Similarity 69.2%; Pred. No. 0; 

Matches 1670; Conservative 0; Mismatches 739; Indels 6; Gaps 2; 

Qy 318 CCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCAAACACCTGTCAGTG 377 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 234 CGCCCTGTGTACGGAGGAGTGTGTGCACGGCCGCTGCGTTTCCCCGGACACCTGCCACTG 293 

Qy 378 TGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGTGATCACTGGGGTCC 437 

I I I I I II II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I II 

Db 294 CGAGCCTGGCTGGGGAGGGCCCGACTGCTCCAGCGGCTGCGACAGCGACCACTGGGGGCC 353 

Qy 438 CCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAACCCCATCACCGGGGC 497 

I I I I I M I II I I I I I I I I I I I I I I M M M Mill I I I I I I I I I I I II II 

Db 354 CCACTGCAGCAACCGGTGCCAGTGCCAGAACGGCGCCCTGTGTAACCCCATCACAGGCGC 413 

Qy 498 TTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGCAGGGCAC 557 

Ml | I I I II I I I I I I II I I I I I I I I I M I I I I I I I I I I I I I I I I I 
Db 414 CTGCGTGTGCGCCGCCGGCTTCCGTGGATGGCGCTGCGAGGAGCTCTGCGCGCCTGGCAC 473 



Qy 558 CTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCGACCACGT 617
I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I
Db 474 CCACGGCAAGGGATGCCAGCTGCCGTGCCAGTGCCGACACGGTGCCAGCTGCGACCCCCG 533

Qy 618 CACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATCTTTGTCC 677
I I II II I I I I I II I I I I I II MINIM I II II I I I II I II II M
Db 534 CGCCGGCGAGTGCCTCTGCGCACCTGGCTACACCGGCGTCTACTGCGAGGAGCTGTGCCC 593

Qy 678 TCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAGGAGTGTG 737
II I II I I I I I I I I I I I II I II I I I I M II I I I I I I I I M I I II
Db 594 TCCTGGGAGCCATGGAGCTCACTGTGAGCTGCGCTGCCCCTGTCAGAATGGGGGCACCTG 653

Qy 738 TCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGTGTGGTCA 797
I I I I I I II I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I II
Db 654 CCACCACATCACTGGCGAGTGTGCCTGCCCCCCAGGCTGGACGGGAGCAGTGTGTGCCCA 713

Qy 798 GCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCCATAATGG 857
M I I I I I I I 1 Mill I I I I I I I I I I II I I I I I I II I I I I
Db 714 GCCCTGCCCACCAGGGACATTTGGCCAGAACTGCAGCCAGGATTGTCCTTGCCACCATGG 773

Qy 858 AGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAGGGGAACG 917
I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I Mill I
Db 774 AGGGCAGTGTGACCACGTGACTGGACAGTGCCACTGTACAGCTGGATACATGGGGGACAG 833

Qy 918 GTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTG 977
I II I II I I I II II I I I I II II I Mill II I II II I III
Db 834 GTGCCAAGAGGAGTGCCCCTTCGGGTCCTTCGGCTTCCAGTGCTCACAGCGCTGTGACTG 893

Qy 978 TGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTGC 1037
II I II II I I II I II II I II II I II Mill I II I I
Db 894 CCACAATGGGGGGCAGTGTTCACCCACCACGGGTGCCTGCGAGTGTGAGCCTGGCTACAA 953

Qy 1038 TGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACAA 1097
III I I I I I I I I II I II II I I II I M II I I I I II I
Db 954 GGGCCCACGCTGCCAGGAGCGACTGTGCCCGGAGGGCCTGCATGGCCCAGGCTGCACCCT 1013

Qy 1098 ACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAGAGTGTGC 1157
I II I I II I II II I I II I II I I I I I I I I I I II I II Ml I
Db 1014 GCCCTGCCCCTGTGACGCTGACAACACCATCAGCTGCCACCCAGTAACTGGAGCTTGTAC 1073

Qy 1158 CTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGATTCTACGG 1217
I I I I I I II I II I II II II I I I II I II I II III II II I I I I I M
Db 1074 CTGCCAGCCAGGCTGGTCTGGTCACCACTGCAATGAATCCTGCCCTGTTGGCTACTATGG 1133

Qy 1218 GGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACTGG 1277
III I II I I II I II II II I M II II I I I II III II I I I I I I II
Db 1134 CGATGGCTGCCAGCTGCCTTGCACCTGTCAGAATGGCGCCGACTGCCACAGCATCACTGG 1193

Qy 1278 AAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTGGG 1337
II I I I I I II I I I I I I II I III I II I I III I II
Db 1194 GGGCTGCACTTGTGCTCCGGGCTTCATGGGAGAGGTCTGTGCCGTTTCCTGTGCAGCAGG 1253

Qy 1338 AACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCTCC 1397
M I II I II I II I I II I I I I I I II I II I I I I I I II I I I II I I I
Db 1254 GACCTATGGCCCCAACTGCTCGTCCATCTGTAGCTGTAACAATGGTGGCACCTGCTCCCC 1313

Qy 1398 TGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGATG 1457



I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III
Db 1314 AGTAGATGGCTCCTGTACCTGCAAGGAAGGGTGGCAGGGCCTGGACTGCACCCTGCCATG 1373

Qy 1458 TCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGAGC 1517
I I I I I II I I I I I I I I II I I II I I I I III II I I I I I I I I I I
Db 1374 TCCCAGTGGGACGTGGGGCCTGAACTGCAACGAGAGCTGCACCTGTGCCAATGGGGCAGC 1433

Qy 1518 CTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGCGA 1577
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1434 CTGCAGCCCCATAGACGGCTCCTGCTCCTGCACTCCTGGCTGGCTGGGAGACACCTGTGA 1493

Qy 1578 ACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGCCA 1637
II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1494 GCTGCCTTGCCCGGATGGCACATTTGGGCTGAACTGCAGTGAACACTGTGACTGCAGCCA 1553

Qy 1638 CGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGTGT 1697
I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I
Db 1554 TGCTGATGGATGTGACCCCGTCACAGGCCACTGCTGCTGCCTGGCCGGATGGACAGGCAT 1613

Qy 1698 CCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGCTA 1757
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II
Db 1614 CCGCTGTGACAGCACGTGTCCACCTGGCCGCTGGGGCCCCAACTGCTCTGTCTCCTGCAG 1673

Qy 1758 CTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGCTT 1817
I I I I I I I I I I i II I I I I I II I II I I I I I I I I I I I I I I I I I I II I I I I I
Db 1674 CTGTGAGAATGGAGGCTCCTGCTCCCCAGAGGATGGGAGCTGCGAGTGTGCCCCTGGCTT 1733

Qy 1818 CCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGCCA 1877
I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III
Db 1734 CCGAGGACCCTTATGCCAGAGAATCTGCCCCCCTGGGTTCTATGGCCACGGCTGCGCCCA 1793

Qy 1878 GACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGTGA 1937
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1794 GCCATGCCCCCTCTGCGTGCACAGCAGCAGGCCCTGCCACCACATCAGCGGCATCTGTGA 1853

Qy 1938 CTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTTGG 1997
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I III I I I I I
Db 1854 GTGCCTCCCAGGATTCTCTGGAGCTCTCTGCAACCAAGTGTGTGCTGGAGGATACTTTGG 1913

Qy 1998 GAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGACAG 2057
II I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1914 GCAGGACTGTGCCCAGCTCTGCTCCTGTGCCAACAACGGGACCTGCAGCCCTATCGATGG 1973

Qy 2058 ATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCTGC 2117
II II Mill I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I
Db 1974 CTCCTGCCAGTGCTTTCCTGGATGGATTGGCAAGGACTGCTCACAGGCTTGCCCACCCGG 2033

Qy 2118 CCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGCGC 2177
I I I I I I I I I I I I I II I I I I I I I I I I I I I I II II II I I I I I I I I I
Db 2034 GTTCTGGGGCCCCGCCTGCTTCCACGCATGCAGCTGCCACAACGGGGCGAGCTGCAGCGC 2093

Qy 2178 CTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGATG 2237
I I I I I I I I II I I I I I I I I I I I I I I I II I I II I I I I I I I I I II I I I
Db 2094 CGAGGACGGGGCCTGCCACTGCACCCCTGGCTGGACTGGACTCTTCTGCACACAGCGCTG 2153



Qy 2238 TCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCTGA 2297
I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 2154 CCCAGCAGCATTTTTTGGGAAGGACTGTGGGCGCGTATGCCAGTGTCAGAATGGCGCCAG 2213

Qy 2298 CTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGTGA 2357
I I I I I I I I I I I III I I I I I I I I I I I I I I I I I II I III II I I I I I I
Db 2214 CTGTGACCACATCAGTGGCAAGTGCACCTGCCGCACAGGCTTCACCGGGCAACACTGTGA 2273

Qy 2358 GCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTGAA 2417
I I I I I II I I I I I I I I I I I I II I I II I I I I I I I I I II I I I I I I I M
Db 2274 GCAGAGATGTGCCCCAGGAACCTTTGGCTATGGGTGTCAGCAGCTATGTGAGTGCATGAA 2333

Qy 2418 CAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGAGC 2477
I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I II III
Db 2334 CAACTCCACCTGTGACCATGTCACCGGCACCTGTTACTGCAGCCCTGGCTTCAAAGGAAT 2393

Qy 2478 GAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACCAG 2537
I I I I II I I I I I I I I I I I II I I I I I I I I Mil I I I I
Db 2394 CAGGT GT GAC CAAGCT GCC CT CAT GAT GGAGG AGCTGAAT CCCTACACCAAGATCAG 2450

Qy 2538 TACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTTGT 2597
I I I I I III II II I I I I II I I I I I I I I I I I III I
Db 2451 CCCAGCACTGGGTGCAGAGCGGCACTCGGTGGGTGCTGTCACAGGCATCATGCTCCTGTT 2510

Qy 2598 CCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACAAGCAGAAGGGAAA 2657
I I Ml I I I I I I I III II I I I II I I I I II
Db 2511 ATTCTTCATTGTGGTGCTGCTGGGCCTATTTGCCTGGCATCGGCGGCGGCAGAAAGAGAA 2570

Qy 2658 GGAAT CAAGCAT G CCAGCAGTTACCTACAC CCCT GCT ATGAGGGT CGT CAAT GCAGA 2714
II I III I I II I I I I I I I I I I I I I I I I I I I II III
Db 2571 GGGCCGAGACCTGGCTCCCCGTGTCTCCTACACACCTGCCATGAGGATGACCAGCACCGA 2630

Qy 2715 TTATACCATTTCAGG 2729
II I I I I I II I
Db 2631 CTACTCCCTCTCAGG 2645

RESULT 8
AK122555

LOCUS AK122555 5278 bp mRNA linear ROD 15-MAR-2003
DEFINITION Mus musculus mRNA for mKIAA1781 protein.
ACCESSION AK122555
VERSION AK122555.1 GI:28972841
KEYWORDS FLI_CDNA.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
FEATURES Location/Qualifiers
source 1..5278
/organism="Mus musculus"
/mol_type="mRNA"
/db_xref="taxon:10090"
/clone="mbg06605"
/tissue_type="brain"
/dev_stage="adult"
/note="vector: modified pBC SK+"
gene 1..5278
/gene="mKIAA1781"
CDS <1..2766
/gene="mKIAA1781"
/note="CDS is predicted by in silico analysis. Start codon
is not identified."
/codon_start=1
/evidence=not_experimental
/product="mKIAA1781 protein"
/protein_id="BAC65837.1"
/db_xref="GI:28972842"

/translation="CPPGSHGAHCELRCPCQNGGTCHHITGECACPPGWTGAVCAQPC 
PPGTFGQNCSQDCPCHHGGQCDHVTGQCHCTAGYMGDRCQEECPFGTFGFLCSQRCDC 
HNGGQCSPATGACECEPGYKGPSCQERLCPEGLHGPGCTLPCPCDTENTISCHPVTGA 
CTCQPGWSGHYCNESCPAGYYGNGCQLPCTCQNGADCHSITGSCTCAPGFMGEVCAVP 
CAAGTYGPNCSSVCSCSNGGTCSPVDGSCTCREGWQGLDCSLPCPSGTWGLNCNETCI 
CANGAACSPFDGSCACTPGWLGDSCELPCPDGTFGLNCSEHCDCSHADGCDPVTGHCC 
CLAGWTGIRCDSTCPPGRWGPNCSVSCSCENGGSCSPEDGSCECAPGFRGPLCQRICP 
PGFYGHGCAQPCPLCVHSRGPCHHISGICECLPGFSGALCNQVCAGGHFGQDCAQLCS 
CANNGTCSPIDGSCQCFPGWIGKDCSQACPSGFWGSACFHTCSCHNGASCSAEDGACH 
CTPGWTGLFCTQRCPSAFFGKDCGHICQCQNGASCDHITGKCTCRTGFSGRHCEQRCA 
PGTFGYGCQQLCECMNNATCDHVTGTCYCSPGFKGIRCDQAALMMDELNPYTKISPAL 
GAERHSVGAVTGIVLLLFLVVVLLGLFAWRRRRQKEKGRDLAPRVSYTPAMRMTSTDY
SLSDLSQSSSHAQCFSNASYHTLACGGPATSQASTLDRNSPTKLSNKSLDRDTAGWTP 
YSYVNVLDSHFQISALEARYPPEDFYIELRHLSRHAEPHSPGTCGMDRRQNTYIMDKG 
FKDYMKESVCSSSTCSLNSSENPYATIKDPPILTCKLPESSYVEMKSPVHLGSPYTDV 
PSLSTSNKNIYEVEPTVSVVQEGRGHNSSYIQNPYDLPKNSHIPGHYDLLPVRQSPAH
GPFQEKQ" 



ORIGIN 



Query Match 30.4%; Score 1040.8; DB 10; Length 5278;
Best Local Similarity 63.3%; Pred. No. 1.6e-310;
Matches 1736; Conservative 0; Mismatches 952; Indels 54; Gaps 7;

Qy 673 TGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAGGA 732
I I I 1 I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I
Db 1 TGCCCCCCTGGGAGCCATGGAGCTCACTGTGAGCTGCGCTGCCCCTGCCAGAATGGAGGC 60

Qy 733 GTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGTGT 792
I I I I II I I II I I I I I I I I I I M I I I I I I I I I I I I I I I I I I
Db 61 ACCTGCCACCACATCACTGGCGAATGTGCCTGCCCTCCAGGCTGGACGGGAGCAGTGTGT 120

Qy 793 GGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCCAT 852 

I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I 

Db 121 GCCCAGCCCTGCCCTCCAGGGACCTTTGGCCAGAACTGTAGCCAGGACTGTCCCTGCCAC 180 

Qy 853 AATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAGGG 912

I I I I I I I II I I I I II II I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 CATGGAGGCCAGTGTGACCATGTGACTGGACAATGCCACTGTACAGCTGGATACATGGGG 240

Qy 913 GAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCTGC 972 

II I I I I I I I I I I I I I I I I I I I I I I II I I I I I III III 

Db 241 GACAGGTGTCAAGAAGAATGTCCCTTTGGAACGTTCGGTTTCCTGTGCTCTCAACGCTGT 300 

Qy 973 CAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAGGC 1032 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 GACTGCCACAATGGAGGTCAATGTTCACCAGCCACAGGGGCCTGTGAGTGTGAGCCTGGC 360

Qy 1033 TTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAATGT 1092 

I III I I I I I I I I I I I I I I I I I I I I I I I I Ml II 

Db 361 TACAAGGGCCCTAGCTGCCAGGAGCGGCTATGCCCTGAGGGCCTGCATGGCCCAGGCTGC 420 

Qy 1093 GACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAGAG 1152 

I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 ACCTTGCCCTGCCCCTGTGACACCGAGAACACTATCAGCTGCCATCCAGTTACTGGAGCT 480

Qy 1153 TGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGATTC 1212 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 TGTACCTGCCAACCAGGCTGGTCTGGCCACTACTGCAATGAGTCCTGCCCCGCCGGCTAC 540

Qy 1213 TACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTG 1272 

MM I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I 

Db 541 TATGGCAACGGTTGCCAGCTACCCTGCACCTGCCAGAACGGTGCTGACTGCCACAGTATC 600 

Qy 1273 ACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCT 1332 

I I I I I I I I I I I I I I I I I II I I I I I III I II I I I I I I II 
Db 601 ACCGGGAGCTGCACTTGTGCTCCAGGCTTCATGGGAGAGGTGTGTGCCGTCCCCTGTGCT 660 

Qy 1333 CTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGC 1392 

I I I II I I II I I I II II I I II I I II I II M I III I III 

Db 661 GCAGGGACCTATGGTCCCAACTGTTCATCTGTATGTAGCTGTAGCAACGGCGGCACCTGT 720 

Qy 1393 TCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATC 1452 

I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I 
Db 721 TCCCCAGTGGATGGCTCCTGCACCTGCCGAGAGGGATGGCAGGGCCTGGACTGCTCCCTG 780

Qy 1453 AGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGG 1512

I I I II I I I I I I I I I I I I I I I I I I I I I I I II II I II I I 

Db 781 CCTTGTCCCAGTGGGACCTGGGGCCTGAACTGCAATGAGACTTGCATCTGTGCCAATGGA 840 

Qy 1513 GGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAA 1572 

I I I I I I II I I I I I I I II I I I I III I II I I I M I II I I I 
Db 841 GCTGCCTGCAGCCCCTTTGATGGGTCCTGTGCCTGCACCCCAGGCTGGCTGGGGGACTCC 900 



Qy 1573 TGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGC 1632 

II I I I I I I M I I I I III II I I I I I II I I I I I I I I II I I I I I I I I I I I I 
Db 901 TGTGAACTGCCCTGCCCGGACGGCACTTTTGGGCTGAACTGCAGTGAGCATTGCGACTGC 960 



Qy 1633 AGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCA 1692
I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I
Db 961 AGCCATGCTGATGGCTGTGACCCTGTCACAGGCCACTGCTGCTGCCTGGCAGGATGGACA 1020

Qy 1693 GGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCC 1752
II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1021 GGCATCCGCTGTGATAGCACGTGTCCTCCAGGTCGCTGGGGCCCCAACTGTTCAGTGTCC 1080

Qy 1753 TGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCA 1812
III I I I I I II II I III I I I I I I I I II II II I II I I I I I II I I I M
Db 1081 TGCAGCTGTGAGAACGGAGGTTCCTGCTCCCCGGAGGACGGGAGCTGCGAGTGTGCCCCT 1140

Qy 1813 GGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGC 1872
I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I
Db 1141 GGCTTTCGAGGACCCTTATGTCAGAGAATCTGCCCACCAGGATTCTACGGCCATGGCTGC 1200

Qy 1873 AGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTG 1932
I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I
Db 1201 GCCCAGCCTTGTCCCCTCTGCGTGCACAGCAGGGGGCCCTGCCACCACATCAGTGGTATC 1260

Qy 1933 TGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGA 1992
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III
Db 1261 TGTGAGTGCCTGCCAGGATTCTCTGGAGCCTTGTGCAACCAAGTGTGTGCTGGAGGGCAC 1320

Qy 1993 TTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATT 2052
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1321 TTCGGGCAGGACTGTGCCCAGCTCTGTTCCTGTGCCAACAATGGGACCTGCAGCCCCATC 1380

Qy 2053 GACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCA 2112
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I
Db 1381 GATGGCTCCTGTCAGTGCTTCCCTGGGTGGATTGGCAAGGACTGCTCACAGGCCTGCCCA 1440

Qy 2113 CCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGC 2172
III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1441 TCTGGGTTCTGGGGCTCTGCCTGCTTCCACACATGCAGCTGCCACAACGGGGCGAGCTGC 1500

Qy 2173 AGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAG 2232
I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I
Db 1501 AGCGCCGAGGATGGGGCCTGCCACTGCACCCCTGGCTGGACTGGACTCTTCTGCACGCAG 1560

Qy 2233 AGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGA 2292
I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I
Db 1561 CGTTGCCCTTCAGCATTTTTTGGGAAGGACTGTGGGCACATATGCCAGTGTCAGAATGGA 1620

Qy 2293 GCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCAC 2352
II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1621 GCCAGCTGTGACCACATCACTGGGAAATGCACCTGTCGAACAGGCTTCTCTGGCCGCCAC 1680

Qy 2353 TGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGT 2412
I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I
Db 1681 TGTGAACAGAGATGTGCCCCTGGAACCTTTGGATATGGGTGTCAGCAGCTATGTGAGTGC 1740

Qy 2413 CTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAG 2472
I I I I I I I I I I I II II I I I I I I I I II I I I I I I I I I I I I I I II I II I I II
Db 1741 ATGAACAATGCCACTTGTGACCACGTCACTGGTACCTGTTACTGTAGCCCGGGATTCAAA 1800



Qy 2473 GGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGA 2532
Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I M M
Db 1801 GGAAT C AGGT GT GAC CAAG CTGC C CT CAT GAT G GAT G AG C T G AAT C C C T AC AC C AAG 1857

Qy 2533 ACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATT 2592
I I I I I I I I I I I I I II II I I I II I I I I I I I I I I
Db 1858 ATCAGTCCAGCTCTGGGAGCAGAGCGGCACTCAGTGGGTGCTGTCACCGGCATCGTTCTC 1917

Qy 2593 CTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACAAGCAGAAG 2652
I I I I I I I I I I I I I II I I I I I I I I I I I
Db 1918 CTGTTGTTCCTGGTGGTGGTGCTGCTGGGCCTGTTTGCCTGGCGACGGAGGCGGCAGAAA 1977

Qy 2653 GGAAAGGAAT CAAGCAT G CCAGCAGTTACCT ACACCC CTGCTAT GAGGGT CGT CAAT 2709
I III III II III I II I I I I I I I I I I I I I I I I II
Db 1978 GAGAAAGGCCGTGACCTGGCTCCCCGAGTCTCCTACACCCCAGCCATGAGGATGACCAGC 2037

Qy 2710 GCAGATTATACCATTT CAGGAACCCTTCCT CACAGCAAT GGT GGAAAC GCTAAT 2763
Mil II I I MM II I II I II I I I II
Db 2038 ACAGACTACTCTCTCTCAGATTTGTCTCAAAGTAGCAGCCATGCCCAGTGCTTTTCCAAT 2097

Qy 2764 AGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCTCAC 2823
I I I I I II I III I MM MM II II
Db 2098 GCCAGCTACCACACACTGGCGTGTGGGGGGCCTGCCACCAGCCAGGCCAGCACTCTGGAC 2157

Qy 2824 GTCAACAACAGGGACAGGATGACTGTCACG-AAGTCAAAAAACAATCAACTGTTTGTGAA 2882
II II I II I I I I Ml I I I I I I I I I II
Db 2158 AGGAACAGCCCCACCAAGCTCAGTAACAAGTCCCTTGACAGAGACACAGCAGGCTGGACC 2217

Qy 2883 T CT T AAAAAT GT GAAC C CT GGGAAGA GAGGCCCTGTGGGGGACTG 2927
I II I I I I I I I II I I II III I
Db 2218 CCCTACAGCTATGTGAACGTGTTAGACTCCCATTTCCAGATCAGTGCCCTGGAGGCCAGG 2277

Qy 2928 CACTGGGACATTGCCGGCTGACTGGAAACATGGCGGCT ACCTCAAC 2973
I I I I I I I I I II I I I II II M
Db 2278 TACCCGCCCGAGGACTTCTACATTGAACTTAGACACCTCAGCCGCCATGCTGAGCCACAC 2337

Qy 2974 GAGCTCGGTGCTTTTGGACTTGACAGAAG C TAT AT G GGAAAAT C C TT A 3021
I II I II I II I I I I II I I I I Mill III II I
Db 2338 TCACCAGGCACTTGTGGAATGGACAGACGTCAGAACACATACATTATGGACAAAGGCTTC 2397

Qy 3022 AAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAACTGCTCCCTAAGCAGTTCTGAG 3081
I I II I II I II II I I I M I I I I II I I I I I M I Ml
Db 2398 AAAGATTACATGAAAGAATCTGTGTGCAGTTCTAGCACTTGCTCCTTGAACAGCAGTGAA 2457

Qy 3082 AACCCATATGCCACTATTAAAGACCCACCTGTACTTATCCCGAAAAGCTCAGAGTGTGGT 3141
I II II I I M II I I II II I II II I M Mill II II I I I M
Db 2458 AACCCTTACGCCACAATTAAGGACCCACCCATCCTCACCTGCAAGCTTCCAGAAAGCAGT 2517

Qy 3142 TATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCATATGCAGAGATCAATAACTCA 3201
II I II I I I I I M I I M I I I I I I I I I I II I I II
Db 2518 TATGTAGAAATGAAGTCACCTGTGCACTTGGGGTCTCCGTACACAGATGTGCCATCCTTG 2577

Qy 3202 ACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACAGTGAGTGTTGTCCAAGGAGTA 3261
I II I II I II I I I M M II I I II I I I II I I I II I I II I I I I I M
Db 2578 TCGACATCTAATAAAAATATATATGAAGTTGAGCCCACAGTCAGTGTGGTCCAAGAAGGC 2637

Qy 3262 TTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGACCTCCCAAAGAACAGTCACATC 3321
I 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II I 1 1 1 1
Db 2638 CGAGGTCATAACTCCAGCTATATCCAGAATCCATACGACCTACCTAAGAACAGCCATATT 2697

Qy 3322 CCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCATCC 3363
III I II I I II I I II I M I II II Ml II I II
Db 2698 CCTGGTCACTATGACCTCCTCCCAGTAAGACAGAGCCCTGCC 2739



RESULT 9
HSM805375

LOCUS HSM805375 3281 bp mRNA linear PRI 12-JUL-2002
DEFINITION Homo sapiens mRNA; cDNA DKFZp434L121 (from clone DKFZp434L121).
ACCESSION AL834326
VERSION AL834326.1 GI:21739945
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 3281)
AUTHORS Poustka,A., Wellenreuther,R., Mewes,H.W., Weil,B. and Wiemann,S.
TITLE Direct Submission
JOURNAL Submitted (09-JUL-2002) 1, D-85764 Neuherberg, GERMANY
COMMENT Clone from S. Wiemann, Molecular Genome Analysis, German Cancer
Research Center (DKFZ); Email s.wiemann@dkfz-heidelberg.de;
sequenced by DKFZ (German Cancer Research Center,
Heidelberg/Germany) within the cDNA sequencing consortium of the
German Genome Project.
This clone (DKFZp434L121) is available at the RZPD in Berlin.
Please contact the RZPD: Ressourcenzentrum, Heubnerweg 6, 14059
Berlin-Charlottenburg, GERMANY; Email: clone@rzpd.de Further
information about the clone and the sequencing project is available
at http://mips.gsf.de/proj/cDNA/.
FEATURES Location/Qualifiers
source 1..3281
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="RZPD:DKFZp434L121"
/db_xref="taxon:9606"
/clone="DKFZp434L121"
/tissue_type="testis"
/clone_lib="434 (synonym: htes3). Vector pSport1; host
DH10B; sites NotI + SalI"
/dev_stage="adult"
gene 1..3281
/gene="DKFZp434L121"
CDS <1..1883
/gene="DKFZp434L121"
/note="similarity to MEGF6 (Rattus norvegicus)"
/codon_start=3
/product="hypothetical protein"
/protein_id="CAD38994.1"
/db_xref="GI:21739946"
/db_xref="GOA:Q8ND91"
/db_xref="SPTREMBL:Q8ND91"
/translation="GRVGPGAVCAQPCPPGTFGQNCSQDCPCHHGGQCDHVTGQCHCT
AGYMGDRCQEECPFGSFGFQCSQRCDCHNGGQCSPTTGACECEPGYKGPRCQERLCPE
GLHGPGCTLPCPCDADNTISCHPVTGACTCQPGWSGHHCNESCPVGYYGDGCQLPCTC
QNGADCHSITGGCTCAPGFMGEVCAVSCAAGTYGPNCSSICSCNNGGTCSPVDGSCTC
KEGWQGLDCTLPCPSGTWGLNCNESCTCANGAACSPIDGSCSCTPGWLGDTCELPCPD
GTFGLNCSEHCDCSHADGCDPVTGHCCCLAGWTGIRCDSTCPPGRWGPNCSVSCSCEN
GGSCSPEDGSCECAPGFRGPLCQRICPPGFYGHGCAQPCPLCVHSSRPCHHISGICEC
LPGFSGALCNQVCAGGYFGQDCAQLCSCANNGTCSPIDGSCQCFPGWIGKDCSQACPP
GFWGPACFHACSCHNGASCSAEDGACHCTPGWTGLFCTQRKPHLLASQPLRIPCCGLL
ATVGIVQTSREGGMQAAPGLWPDSCPTRTEELCRGSSRPDWIQGIDKPKVLEGQGCK
AAQQHFLGRTVGAYASVRMAPAVTTSVASAPAAQASPGNTVSRDVPQEPLAMGVSSYV
SA"
polyA_site 3232
/gene="DKFZp434L121"

ORIGIN 

Query Match 21.7%; Score 742; DB 9; Length 3281; 

Best Local Similarity 69.2%; Pred. No. 8.4e-218; 

Matches 1012; Conservative 0; Mismatches 450; Indels 0; Gaps 0; 
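
The percentages in the header lines above can be reproduced from the reported score and counts. The following is a minimal sketch in Python, under two assumptions that are inferred from the figures in this listing rather than taken from the search program's documentation: that Query Match is the score divided by the perfect score of the query (3423), and that Best Local Similarity is the number of matches divided by the total number of aligned columns. The function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's perfect score (assumed definition)."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Matches as a percentage of all aligned columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Values reported for RESULT 9 (HSM805375); 3423 is the perfect score of the query.
print(query_match_pct(742, 3423))                  # 21.7
print(best_local_similarity_pct(1012, 0, 450, 0))  # 69.2
```

The same two ratios also agree with the header lines printed for the other results in this listing, which makes them a quick check for OCR damage in the counts.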

Qy 781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840 

II I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I 

Db 18 GGAGCAGTGTGTGCCCAGCCCTGCCCACCAGGGACATTTGGCCAGAACTGCAGCCAGGAT 77 

Qy 841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 78 TGTCCTTGCCACCATGGAGGGCAGTGTGACCACGTGACTGGACAGTGCCACTGTACAGCT 137

Qy 901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 138 GGATACATGGGGGACAGGTGCCAAGAGGAGTGCCCCTTCGGGTCCTTCGGCTTCCAGTGC 197 

Qy 961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020

I II III III I I I I I I I I I I I I II II I I I I I I I I 
Db 198 TCACAGCGCTGTGACTGCCACAATGGGGGGCAGTGTTCACCCACCACGGGTGCCTGCGAG 257 

Qy 1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080 

I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 258 TGTGAGCCTGGCTACAAGGGCCCACGCTGCCAGGAGCGACTGTGCCCGGAGGGCCTGCAT 317 

Qy 1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140

III II I I I I II I I I II I I I I I I I I I I I I I I I I I 

Db 318 GGCCCAGGCTGCACCCTGCCCTGCCCCTGTGACGCTGACAACACCATCAGCTGCCACCCA 377 

Qy 1141 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1200

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 378 GTAACTGGAGCTTGTACCTGCCAGCCAGGCTGGTCTGGTCACCACTGCAATGAATCCTGC 437 

Qy 1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 438 CCTGTTGGCTACTATGGCGATGGCTGCCAGCTGCCTTGCACCTGTCAGAATGGCGCCGAC 497

Qy 1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I 

Db 498 TGCCACAGCATCACTGGGGGCTGCACTTGTGCTCCGGGCTTCATGGGAGAGGTCTGTGCC 557 

Qy 1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380
III I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I
Db 558 GTTTCCTGTGCAGCAGGGACCTATGGCCCCAACTGCTCGTCCATCTGTAGCTGTAACAAT 617

Qy 1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440
III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M
Db 618 GGTGGCACCTGCTCCCCAGTAGATGGCTCCTGTACCTGCAAGGAAGGGTGGCAGGGCCTG 677

Qy 1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III
Db 678 GACTGCACCCTGCCATGTCCCAGTGGGACGTGGGGCCTGAACTGCAACGAGAGCTGCACC 737

Qy 1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560
II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I
Db 738 TGTGCCAATGGGGCAGCCTGCAGCCCCATAGACGGCTCCTGCTCCTGCACTCCTGGCTGG 797

Qy 1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III
Db 798 CTGGGAGACACCTGTGAGCTGCCTTGCCCGGATGGCACATTTGGGCTGAACTGCAGTGAA 857

Qy 1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680
I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 858 CACTGTGACTGCAGCCATGCTGATGGATGTGACCCCGTCACAGGCCACTGCTGCTGCCTG 917

Qy 1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1740
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 918 GCCGGATGGACAGGCATCCGCTGTGACAGCACGTGTCCACCTGGCCGCTGGGGCCCCAAC 977

Qy 1741 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 1800
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 978 TGCTCTGTCTCCTGCAGCTGTGAGAATGGAGGCTCCTGCTCCCCAGAGGATGGGAGCTGC 1037

Qy 1801 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 1860
I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1038 GAGTGTGCCCCTGGCTTCCGAGGACCCTTATGCCAGAGAATCTGCCCCCCTGGGTTCTAT 1097

Qy 1861 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCAC 1920
II III I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I
Db 1098 GGCCATGGCTGCGCCCAGCCATGCCCCCTCTGCGTGCACAGCAGCAGGCCCTGCCACCAC 1157

Qy 1921 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 1980
I I I I I I I I I I I I I I III I II II III I II II I I I I I I I I I I I I I I I I
Db 1158 ATCAGCGGCATCTGTGAGTGCCTCCCAGGATTCTCTGGAGCTCTCTGCAACCAAGTGTGT 1217

Qy 1981 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2040
I III I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I
Db 1218 GCTGGAGGATACTTTGGGCAGGACTGTGCCCAGCTCTGCTCCTGTGCCAACAACGGGACC 1277

Qy 2041 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2100
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1278 TGCAGCCCTATCGATGGCTCCTGCCAGTGCTTTCCTGGATGGATTGGCAAGGACTGCTCA 1337

Qy 2101 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2160
II I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I
Db 1338 CAGGCTTGCCCACCCGGGTTCTGGGGCCCCGCCTGCTTCCACGCATGCAGCTGCCACAAC 1397

Qy 2161 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2220
I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I
Db 1398 GGGGCGAGCTGCAGCGCCGAGGACGGGGCCTGCCACTGCACCCCTGGCTGGACTGGACTC 1457

Qy 2221 TACTGCACTCAGAGATGTCCTC 2242
I I I I I I I I I I I III
Db 1458 TTCTGCACACAGCGTAAGCCCC 1479



RESULT 10 
AF444274 

LOCUS AF444274 4290 bp mRNA linear ROD 06-DEC-2001
DEFINITION Mus musculus Jedi protein mRNA, complete cds.
ACCESSION AF444274
VERSION AF444274.1 GI:17386052
KEYWORDS
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 4290)
AUTHORS Krivtsov,A.V., Zinovyeva,M.V., Hendrikx,J., Visser,J.W.M. and
Belyavsky,A.V.
TITLE Jedi is a novel DSL and EGF-like repeat motif-containing protein
expressed on non-differentiated hematopoietic cells
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 4290)
AUTHORS Krivtsov,A.V., Zinovyeva,M.V., Visser,J.W.M. and Belyavsky,A.V.
TITLE Direct Submission
JOURNAL Submitted (07-NOV-2001) Stem Cell Biology, Lindsley F. Kimball
Research Institute, New York Blood Center, 310 East 67 Street, New
York, NY 10021, USA
FEATURES Location/Qualifiers
source 1..4290
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL"
/db_xref="taxon:10090"
/clone="E-7"
/tissue_type="testis"
CDS 138..3242
/codon_start=1
/product="Jedi protein"
/protein_id="AAL38571.1"
/db_xref="GI:17386053"
/translation="MPLCPLLLLALGLRLTGTLNSNDPNVCTFWESFTTTTKESHLRP
FSLLPAESCHRPWEDPHTCAQPTVVYRTVYRQVVKMDSRPRLQCCRGYYESRGACVPL
CAQECVHGRCVAPNQCQCAPGWRGGDCSSECAPGMWGPQCDKFCHCGNNSSCDPKSGA 
CFCPSGLQPPNCLQPCPAGHYGPACQFDCQCYGASCDPQDGACFCPPGRAGPSCNVPC 
SQGTDGFFCPRTYPCQNGGVPQGSQGSCSCPPGWMGVICSLPCPEGFHGPNCTQECRC 
HNGGLCDRFTGQCHCAPGYIGDRCQEECPVGRFGQDCAETCDCAPGARCFPANGACLC 
EHGFTGDRCTERLCPDGRYGLSCQEPCTCDPEHSLSCHPMHGECSCQPGWAGLHCNES 
CPQDTHGPGCQEHCLCLHGGLCLADSGLCRCAPGYTGPHCANLCPPDTYGINCSSRCS 
CENAIACSPIDGTCICKEGWQRGNCSVPCPLGTWGFNCNASCQCAHDGVCSPQTGACT
CTPGWHGAHCQLPCPKGQFGEGCASVCDCDHSDGCDPVHGQCRCQAGWMGTRCHLPCP
EGFWGANCSNTCTCKNGGTCVSENGNCVCAPGFRGPSCQRPCPPGRYGKRCVQCKCNN
NHSSCHPSDGTCSCLAGWTGPDCSEACPPGHWGLKCSQLCQCHHGGTCHPQDGSCICT 
PGWTGPNCLEGCPPRMFGVNCSQLCQCDLGEMCHPQTGACVCPPGHSGADCKMGSQES 
FTIMPTSPVTHNSLGAVIGIAVLGTLWALIALFIGYRQWQKGKEHEHLAVAYSTGRL 
DGSDYVMPDVSPSYSHYYSNPSYHTLSQCSPNPPPPNKVPGSQLFVSSQAPERPSRAH 
GRENHVTLPADWKHRREPHERGASHLDRSYSCSYSHRNGPGPFCHKGPISEEGLGASV 



MSLSSENPYATIRDLPSLPGEPRESGYVEMKGPPSVSPPRQSLHLRDRQQRQLQPQRD
SGTYEQPSPLSHNEESLGSTPPLPPGLPPGQYDSPKNSHIPGHYDLPPVRHPPSPPSR
RQDR"
misc_feature 402..524
/note="Region: DSL domain"
misc_feature 528..2339
/note="Region: contains 14 EGF-like repeat motifs"
misc_feature 2406..2465
/note="Region: TM domain"
misc_feature 3087..3158
/note="Region: PEST"
ORIGIN



Query Match 19.0%; Score 649.4; DB 10; Length 4290; 

Best Local Similarity 57.2%; Pred. No. 5e-189; 

Matches 1241; Conservative 0; Mismatches 916; Indels 12; Gaps 3;

Qy 74 CTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACTCAGTGACTGTGC 133
IMM I I I I I I I.I I I I I I I I I I I I I I I I I I I I I I I
Db 190 CACTCAACTCCAATGATCCCAATGTCTGTACCTTTTGGGAAAGCTTCACCACGACCACTA 249

Qy 134 AAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGCACTGACATTC 190
I I I I II I I I I I I I M I I I I I I I I I
Db 250 AGGAGTCCCACCTGCGCCCCTTCAGCCTGCTCCCAGCTGAGTCCTGCCACAGGCCCTGGG 309



Qy 191 TAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTATCGACATGGGG 250
|| | || | | I I I II I I I I I I I I I I I I I I I I I I
Db 310 AGGACCCCCACACCTGTGCCCAGCCTACGGTTGTCTACCGGACTGTGTACCGTCAGGTGG 369

Qy 251 AGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAAAGCGGGGAAA 310
| M I I I M I I I Ml M I I
Db 370 TGAAGATGGACTCCCGCCCACGCCTGCAGTGCTGTAGGGGTTACTACGAGAGCAGAGGGG 429

Qy 311 TGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCAAACACCT 370
Mill | I I I I I I I I I I I I I I J I I I I I I I I I I M I I I I I
Db 430 CCTGTGTCCCACTCTGTGCCCAGGAGTGTGTCCATGGTCGCTGTGTGGCTCCGAATCAGT 489

Qy 371 GTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGTGATCACT 430
I I I I I I I I I I I I I II I I I I I I I II II I I I I I I I III I I
Db 490 GCCAGTGTGCACCAGGCTGGCGGGGTGGCGACTGCTCCAGCGAGTGTGCCCCGGGAATGT 549

Qy 431 GGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAACCCCATCA 490
I I I I I I I I II M II I I I I I II I M I I I I I I
Db 550 GGGGACCACAGTGTGACAAGTTCTGCCACTGTGGCAACAACAGTTCCTGTGATCCCAAGA 609

Qy 491 CCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGC 550
I I I I I I I I II I I I I I I I I I I M I II II
Db 610 GTGGGGCGTGCTTTTGCCCCTCTGGCCTGCAGCCCCCCAACTGCCTTCAGCCTTGCCCTG 669

Qy 551 AGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCG 610
M I I I II I I I I I I I I I I I I I I I I I I I I I I I I Mill
Db 670 CCGGCCACTATGGTCCTGCCTGCCAGTTTGATTGCCAGTGC TATGGGGCATCCTGTG 726

Qy 611 ACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATC 670
| | | | Ml || I I I I I I I I II I I I I I I I I I I I I I I
Db 727 ACCCCCAGGATGGAGCCTGTTTCTGCCCTCCAGGGAGAGCAGGACCCAGCTGTAATGTGC 786



Qy 671 TTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAG 730
1 1 1 1 1 III II 1 1 II 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1
Db 787 CCTGTTCACAGGGCACTGATGGCTTCTTCTGCCCCAGAACCTATCCTTGCCAAAATGGAG 846

Qy 731 GAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGT 790
III Ml 1 1 1 II 1 1 1 1 1 1 1 1 M
Db 847 GTGTTCCTCAGGGCTCCCAAGGCTCCTGCAGCTGCCCACCGGGCTGGATGGGTGTCATTT 906

Qy 791 GTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCC 850
II I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 M 1 1 1 1 1 M 1 1 1 N
Db 907 GTTCCCTGCCATGCCCAGAGGGTTTCCATGGACCCAACTGTACTCAGGAATGTCGCTGCC 966

Qy 851 ATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAG 910
I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Ml
Db 967 ACAACGGTGGCCTCTGTGACAGGTTTACTGGGCAGTGCCACTGTGCTCCTGGCTATATCG 1026

Qy 911 GGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCT 970
MM 1 II II II 1 M 1 1 II 1 II II II II M 1 1 1 M M 1 1 M II 1 1
Db 1027 GGGATCGGTGCCAAGAAGAGTGCCCCGTGGGCCGCTTTGGTCAAGACTGTGCTGAGACCT 1086

Qy 971 GCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAG 1030
1 I I I II II 1 II 1 1 1 1 1 1 II 1 1 1 1 1 M 1 M 1
Db 1087 GTGACTGTGCTCCTGGCGCCCGTTGCTTTCCTGCTAATGGCGCGTGTCTGTGCGAACATG 1146

Qy 1031 GCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAAT 1090
I I II 1 1 1 II 1 1 1 II 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 M II 1
Db 1147 GCTTCACAGGCGACCGCTGCACTGAGCGCCTCTGTCCGGATGGCCGCTATGGTCTGAGCT 1206

Qy 1091 GTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAG 1150
I | II II II 1 1 1 1 1 II 1 II 1 1 1 1 1 M 1 1 1 1 1 1 1 M III
Db 1207 GCCAGGAGCCCTGCACCTGCGACCCAGAACACAGTCTCAGCTGCCACCCGATGCACGGCG 1266

Qy 1151 AGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGAT 1210
1 1 II II II 1 1 M 1 1 1 M 1 1 II 1 M 1 1 II 1 1 1 1 M II III 1
Db 1267 AGTGCTCCTGCCAGCCAGGTTGGGCGGGCCTCCACTGCAACGAGAGCTGCCCTCAGGACA 1326

Qy 1211 TCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTG 1270
III 1 i 1 II 1 1 1 1 MM Ml 1 1 1 1 1 Ml 1
Db 1327 CGCATGGCCCGGGCTGCCAGGAGCACTGCCTCTGTCTGCACGGAGGGCTCTGCCTTGCCG 1386

Qy 1271 TGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCC 1330
1 II 1 II 1 II 1 1 1 III 1 1 II II 1 M II II 1 1
Db 1387 ACAGCGGCCTCTGTCGGTGCGCGCCAGGATACACGGGACCTCACTGCGCTAACCTATGTC 1446

Qy 1331 CTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCT 1390
I I I I | 1 1 II II II 1 II II II 1 1 II 1 M M 1 1 1 II II II II Ml
Db 1447 CACCGGACACTTATGGGATCAACTGTTCCTCCCGCTGCTCCTGTGAAAATGCCATTGCCT 1506

Qy 1391 GCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCA 1450
M II 1 II 1 1 1 1 II 1 1 1 1 M 1 II 1 1 1 1 1 1 1 II 1 M 1 M II II
Db 1507 GCTCTCCCATTGACGGCACGTGCATCTGCAAGGAAGGTTGGCAGCGTGGTAACTGCTCTG 1566

Qy 1451 TCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACG 1510
I M 1 II 1 1 II 1 1 1 M 1 M 1 II MM 1 1 II II 1 M 1 1 1 1
Db 1567 TTCCCTGTCCCCTTGGCACCTGGGGCTTCAATTGCAATGCCAGTTGCCAGTGTGCCCACG 1626

Qy 1511 GGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGA 1570



I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I III
Db 1627 ATGGAGTCTGCAGCCCCCAAACTGGAGCCTGTACTTGCACCCCTGGGTGGCATGGTGCTC 1686

Qy 1571 AATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACT 1630
I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I
Db 1687 ACTGCCAGCTTCCCTGCCCGAAGGGACAGTTTGGTGAAGGCTGTGCCAGTGTCTGTGACT 1746

Qy 1631 GCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGT 1690
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1747 GTGACCACTCTGATGGCTGTGACCCTGTTCATGGACAGTGCCGATGTCAGGCTGGTTGGA 1806

Qy 1691 CAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGC 1750
II I I I I II II I I I I I I I I I I I I I I I II I I
Db 1807 TGGGCACACGCTGCCACCTGCCTTGCCCGGAGGGCTTTTGGGGAGCCAACTGCAGTAACA 1866

Qy 1751 CCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCAC 1810
I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1867 CCTGTACCTGCAAGAATGGTGGTACCTGTGTGTCTGAGAATGGCAACTGCGTGTGCGCAC 1926

Qy 1811 CAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCT 1870
I I I I I I I I I I I I I I I I i I I I I I I I I I I I I II I I I I I I I I I I I I I I
Db 1927 CAGGGTTCCGAGGCCCCTCCTGCCAGAGGCCCTGCCCGCCTGGTCGCTATGGCAAACGCT 1986

Qy 1871 GCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCC 1930
I II I I I I I I I I I II I I I I I I III
Db 1987 GTGTGCA ATGCAAGTGTAACAACAACCATTCTTCCTGCCACCCATCGGACGGGA 2040

Qy 1931 TGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCA 1990
II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III
Db 2041 CCTGCTCCTGCCTGGCGGGCTGGACAGGCCCTGACTGCTCTGAGGCATGTCCCCCAGGCC 2100

Qy 1991 GATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCA 2050
I II I I I I I I I I I II I I I I I I I I I I I I I
Db 2101 ACTGGGGACTCAAATGCTCCCAACTCTGCCAGTGTCATCATGGTGGGACCTGCCACCCCC 2160

Qy 2051 TTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTC 2110
II I III II I I I I I I I I I I I I I I I I I II III
Db 2161 AGGATGGGAGCTGTATCTGCACGCCAGGCTGGACTGGACCCAACTGCTTGGAAGGCTGCC 2220

Qy 2111 CACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCT 2170
I I I I I II I I I I I I III II III II I I I I II
Db 2221 CACCAAGAATGTTTGGTGTCAACTGCTCCCAGCTATGTCAGTGTGATCTCGGAGAGATGT 2280

Qy 2171 GCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTC 2230
II II I I I I I I III II I I I I I I II I I I I I I
Db 2281 GCCACCCACAGACTGGGGCTTGTGTCTGTCCCCCAGGACACAGTGGTGCAGACTGCAAAA 2340

Qy 2231 AGAGATGTC 2239
I I I I I
Db 2341 TGGGAAGCC 2349
LOCUS AF461685 4539 bp mRNA linear ROD 21-JAN-2002
DEFINITION Mus musculus Jedi-736 protein mRNA, complete cds.
ACCESSION AF461685
VERSION AF461685.1 GI:18252657
KEYWORDS
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 4539)
AUTHORS Krivtsov,A.V., Zinovyeva,M.V., Hendrikx,J., Visser,J.W.M. and
Belyavsky,A.V.
TITLE Jedi is a novel DSL and EGF-like repeat motif-containing protein
expressed on non-differentiated hematopoietic cells
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 4539)
AUTHORS Krivtsov,A.V., Zinovyeva,M.V., Visser,J.W.M. and Belyavsky,A.V.
TITLE Direct Submission
JOURNAL Submitted (20-DEC-2001) Stem Cell Biology, Lindsley F. Kimball
Research Institute, New York Blood Center, 310 East 67th Street,
New York, NY 10021, USA
FEATURES Location/Qualifiers
source 1..4539
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL"
/db_xref="taxon:10090"
/clone="F-4"
/tissue_type="testis"
CDS 71..2314
/note="soluble form"
/codon_start=1
/product="Jedi-736 protein"
/protein_id="AAL66380.1"
/db_xref="GI:18252658"

/translation="MPLCPLLLLALGLRLTGTLNSNDPNVCTFWESFTTTTKESHLRP 
FSLLPAESCHRPWEDPHTCAQPTWYRTVYRQWKMDSRPRLQCCRGYYESRGACVPL 
CAQECVHGRCVAPNQCQCAPGWRGGDCSSECAPGMWGPQCDKFCHCGNNSSCDPKSGA 
CFCPSGLQPPNCLQPCPAGHYGPACQFDCQCYGASCDPQDGACFCPPGRAGPSCNVPC 
SQGTDGFFCPRTYPCQNGGVPQGSQGSCSCPPGWMGVICSLPCPEGFHGPNCTQECRC 
HNGGLCDRFTGQCHCAPGYIGDRCQEECPVGRFGQDCAETCDCAPGARCFPANGACLC 
EHGFTGDRCTERLCPDGRYGLSCQEPCTCDPEHSLSCHPMHGECSCQPGWAGLHCNES 
CPQDTHGPGCQEHCLCLHGGLCLADSGLCRCAPGYTGPHCANLCPPDTYGINCSSRCS 
CENAIACSPIDGTCICKEGWQRGNCSVPCPLGTWGFNCNASCQCAHDGVCSPQTGACT
CTPGWHGAHCQLPCPKGQFGEGCASVCDCDHSDGCDPVHGQCRCQAGWMGTRCHLPCP 
EGFWGANCSNTCTCKNGGTCVSENGNCVCAPGFRGPSCQRPCPPGRYGKRCVQCKCNN 
NHSSCHPSDGTCSCLAGWTGPDCSEACPPGHWGLKCSQLCQCHHGGTCHPQDGSCICT 
PGWTGPNCLEGCPPRMFGVNCSQLCQCDLGEMCHPQTGACVCPPGHSGADCKMGESFA 
PLTLVFL" 



ORIGIN 



Query Match 18.9%; Score 648.6; DB 10; Length 4539; 

Best Local Similarity 57.3%; Pred. No. 8.9e-189; 

Matches 1236; Conservative 0; Mismatches 909; Indels 12; Gaps 3;

Qy 74 CTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACTCAGTGACTGTGC 133
I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I III
Db 123 CACTCAACTCCAATGATCCCAATGTCTGTACCTTTTGGGAAAGCTTCACCACGACCACTA 182



Qy 134 AAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGCACTGACATTC 190



I I I I I I I I ! I I I I I I I I I 

Db 183 AGGAGTCCCACCTGCGCCCCTTCAGCCTGCTCCCAGCTGAGTCCTGCCACAGGCCCTGGG 242 

Qy 191 TAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTATCGACATGGGG 250 

II I I I I I I I I I I I I I II I I I I I I I I I I I M 

Db 243 AGGACCCCCACACCTGTGCCCAGCCTACGGTTGTCTACCGGACTGTGTACCGTCAGGTGG 302 

Qy 251 AGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAAAGCGGGGAAA 310

I I II I I II I I I I I I I I I I II I I I 

Db 303 TGAAGATGGACTCCCGCCCACGCCTGCAGTGCTGTAGGGGTTACTACGAGAGCAGAGGGG 362 

Qy 311 TGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCAAACACCT 370 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I 

Db 363 CCTGTGTCCCACTCTGTGCCCAGGAGTGTGTCCATGGTCGCTGTGTGGCTCCGAATCAGT 422 

Qy 371 GTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGTGATCACT 430 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml I I 

Db 423 GCCAGTGTGCACCAGGCTGGCGGGGTGGCGACTGCTCCAGCGAGTGTGCCCCGGGAATGT 482 

Qy 431 GGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAACCCCATCA 490 

I I I I II I I I I II I I I I I II II I II I I I I I I 

Db 483 GGGGACCACAGTGTGACAAGTTCTGCCACTGTGGCAACAACAGTTCCTGTGATCCCAAGA 542

Qy 491 CCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGC 550 

I I I I I I I I II I I I I I I I I I I I I I M M 

Db 543 GTGGGGCGTGCTTTTGCCCCTCTGGCCTGCAGCCCCCCAACTGCCTTCAGCCTTGCCCTG 602

Qy 551 AGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCG 610

Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 603 CCGGCCACTATGGTCCTGCCTGCCAGTTTGATTGCCAGTGC TATGGGGCATCCTGTG 659 

Qy 611 ACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATC 670 

I I I I III II I I I I I I I I I I I I I I I II I I I I I I I 

Db 660 ACCCCCAGGATGGAGCCTGTTTCTGCCCTCCAGGGAGAGCAGGACCCAGCTGTAATGTGC 719 

Qy 671 TTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAG 730

I I I I II I I - I I I I I I I I I I I I I I I I I

Db 720 CCTGTTCACAGGGCACTGATGGCTTCTTCTGCCCCAGAACCTATCCTTGCCAAAATGGAG 779

Qy 731 GAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGT 790

III III II II I I I I I I I I I I I I I I II 

Db 780 GTGTTCCTCAGGGCTCCCAAGGCTCCTGCAGCTGCCCACCGGGCTGGATGGGTGTCATTT 839 

Qy 791 GTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCC 850 

II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 840 GTTCCCTGCCATGCCCAGAGGGTTTCCATGGACCCAACTGTACTCAGGAATGTCGCTGCC 899

Qy 851 ATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAG 910

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 900 ACAACGGTGGCCTCTGTGACAGGTTTACTGGGCAGTGCCACTGTGCTCCTGGCTATATCG 959 

Qy 911 GGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCT 970 

I I I I I I II I I I I II I I I II I I II M M I I I I I I I I I I I I I I I I I 

Db 960 GGGATCGGTGCCAAGAAGAGTGCCCCGTGGGCCGCTTTGGTCAAGACTGTGCTGAGACCT 1019 

Qy 971 GCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAG 1030
I I I I I I III I I I I I I I I I I I I I I I I 



Db 1020 GTGACTGTGCTCCTGGCGCCCGTTGCTTTCCTGCTAATGGCGCGTGTCTGTGCGAACATG 1079

Qy 1031 GCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAAT 1090
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I M I
Db 1080 GCTTCACAGGCGACCGCTGCACTGAGCGCCTCTGTCCGGATGGCCGCTATGGTCTGAGCT 1139

Qy 1091 GTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAG 1150
I I II II I I I II I I I I I I I I I I I I I I I I I I I I I I I Ml
Db 1140 GCCAGGAGCCCTGCACCTGCGACCCAGAACACAGTCTCAGCTGCCACCCGATGCACGGCG 1199

Qy 1151 AGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGAT 1210
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II Ml I
Db 1200 AGTGCTCCTGCCAGCCAGGTTGGGCGGGCCTCCACTGCAACGAGAGCTGCCCTCAGGACA 1259

Qy 1211 TCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTG 1270
I I I I I I I I I I I I I I I I I I I I I I I I III I
Db 1260 CGCATGGCCCGGGCTGCCAGGAGCACTGCCTCTGTCTGCACGGAGGGCTCTGCCTTGCCG 1319

Qy 1271 TGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCC 1330
I II II I I I I I I I I I I I I I III I I I I I I I I I II I I I I
Db 1320 ACAGCGGCCTCTGTCGGTGCGCGCCAGGATACACGGGACCTCACTGCGCTAACCTATGTC 1379

Qy 1331 CTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCT 1390
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III
Db 1380 CACCGGACACTTATGGGATCAACTGTTCCTCCCGCTGCTCCTGTGAAAATGCCATTGCCT 1439

Qy 1391 GCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCA 1450
I I II I II I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I
Db 1440 GCTCTCCCATTGACGGCACGTGCATCTGCAAGGAAGGTTGGCAGCGTGGTAACTGCTCTG 1499

Qy 1451 TCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACG 1510
I I I II I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I
Db 1500 TTCCCTGTCCCCTTGGCACCTGGGGCTTCAATTGCAATGCCAGTTGCCAGTGTGCCCACG 1559

Qy 1511 GGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGA 1570
I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I III
Db 1560 ATGGAGTCTGCAGCCCCCAAACTGGAGCCTGTACTTGCACCCCTGGGTGGCATGGTGCTC 1619

Qy 1571 AATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACT 1630
I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I
Db 1620 ACTGCCAGCTTCCCTGCCCGAAGGGACAGTTTGGTGAAGGCTGTGCCAGTGTCTGTGACT 1679

Qy 1631 GCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGT 1690
I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II
Db 1680 GTGACCACTCTGATGGCTGTGACCCTGTTCATGGACAGTGCCGATGTCAGGCTGGTTGGA 1739

Qy 1691 CAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGC 1750
II I I I I II II I I I I I I I I I II I I I I I I I I
Db 1740 TGGGCACACGCTGCCACCTGCCTTGCCCGGAGGGCTTTTGGGGAGCCAACTGCAGTAACA 1799

Qy 1751 CCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCAC 1810
I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I
Db 1800 CCTGTACCTGCAAGAATGGTGGTACCTGTGTGTCTGAGAATGGCAACTGCGTGTGCGCAC 1859

Qy 1811 CAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCT 1870
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I
Db 1860 CAGGGTTCCGAGGCCCCTCCTGCCAGAGGCCCTGCCCGCCTGGTCGCTATGGCAAACGCT 1919



Qy 1871 GCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCC 1930 

1 M I I I I II I I I I I I I I I I I III 

Db 1920 GTGTGCA ATGCAAGTGTAACAACAACCATTCTTCCTGCCACCCATCGGACGGGA 1973

Qy 1931 TGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCA 1990 

M I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I III 
Db 1974 CCTGCTCCTGCCTGGCGGGCTGGACAGGCCCTGACTGCTCTGAGGCATGTCCCCCAGGCC 2033 

Qy 1991 GATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCA 2050

I M I I I I I I I I I II I I I I I I I I I I I I I

Db 2034 ACTGGGGACTCAAATGCTCCCAACTCTGCCAGTGTCATCATGGTGGGACCTGCCACCCCC 2093

Qy 2051 TTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTC 2110

M I III II I I I I I I I I I I I MINI II III 

Db 2094 AGGATGGGAGCTGTATCTGCACGCCAGGCTGGACTGGACCCAACTGCTTGGAAGGCTGCC 2153 

Qy 2111 CACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCT 2170 

I I I I I II I I I I I I I II II III II I I I I II 

Db 2154 CACCAAGAATGTTTGGTGTCAACTGCTCCCAGCTATGTCAGTGTGATCTCGGAGAGATGT 2213

Qy 2171 GCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCA 2227 

II III I I I I I II I II I I I I I I II I II I I I 

Db 2214 GCCACCCACAGACTGGGGCTTGTGTCTGTCCCCCAGGACACAGTGGTGCAGACTGCA 2270



RESULT 12 

AF440279 

LOCUS AF440279 4482 bp mRNA linear ROD 20-NOV-2001
DEFINITION Mus musculus MEGF12 (Megf12) mRNA, complete cds.
ACCESSION AF440279
VERSION AF440279.1 GI:17017250
KEYWORDS
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1 (bases 1 to 4482)
AUTHORS Ivanova,N.B. and Lemischka,I.R.
TITLE The global gene expression profiling of the hematopoietic stem cell
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 4482)
AUTHORS Ivanova,N.B. and Lemischka,I.R.
TITLE Direct Submission
JOURNAL Submitted (25-OCT-2001) Molecular Biology, Princeton University,
Washington Road, Princeton, NJ 08544, USA
FEATURES Location/Qualifiers
source 1..4482
/organism="Mus musculus"
/mol_type="mRNA"
/strain="C57BL/6"
/db_xref="taxon:10090"
/cell_type="AA4+Sca+Kit+Lin- hematopoietic stem cell"
/tissue_type="liver"
/dev_stage="14 day old fetus"
gene 1..4482
/gene="Megf12"
CDS 338..3442
/gene="Megf12"
/note="contains signal peptide, 13.5 EGF-like-domains and
transmembrane domain"
/codon_start=1
/product="MEGF12"
/protein_id="AAL33583.1"
/db_xref="GI:17017251"

/translation="MPLCPLLLLAXGLRLTGTLNSNDPNVCTFWESFTTTTKESHLRP
FSLLPAESCHRPWEDPHTCAQPTWYRTVYRQWKMDSRPRLQCCRGYYESRGACVPL 
CAQECVHGRCVAPNQCQCAPGWRGGDCSSECAPGMWGPQCDKFCHCGNNSSCDPKSGT 
CFCPSGLQPPNCLQPCPAGHYGPACQFDCQCYGASCDPQDGACFCPPGRAGPSCNVPC 
SQGTDGFFCPRTYPCQNGGVPQGSQGSCSCPPGWMGVICSLPCPEGFHGPNCTQECRC 
HNGGLCDRFTGQCHCAPGYIGDRCQEECPVGRFGQDCAETCDCAPGARCFPANGACLC 
EHGFTGDRCTERLCPDGRYGLSCQEPCTCDPEHSLSCHPMHGECSCQPGWAGLHCNES 
CPQDTHGPGCQEHCLCLHGGLCLADSGLCRCAPGYTGPHCANLCPPDTYGINCSSRCS 
CENAIACSPIDGTCICKEGWQRGNCSVPCPLGTWGFNCNASCQCAHDGVCSPQTGACT
CTPGWHGAHCQLPCPKGQFGEGCASVCDCDHSDGCDPVHGQCRCQAGWMGTRCHLPCP 
EGFWGANCSNTCTCKNGGTCVSENGNCVCAPGFRGPSCQRPCPPGRYGKRCVQCKCNN 
NHSSCHPSDGTCSCLAGWTGPDCSEACPPGHWGLKCSQLCQCHHGGTCHPQDGSCICT 
PGWTGPNCLEGCPPRMFGVNCSQLCQCDLGEMCHPETGACVCPPGHSGADCKMGSQES 
FTIMPTSPVTHNSLGAVIGIAVLGTLWALIALFIGYRQWQKGKEHEHLAVAYSTGRL
DGSDYVMPDVSPSYSHYYSNPSYHTLSQCSPNPPPPNKVPGSQLFVSSQAPERPSRAH
GRENHVTLPADWKHRREPHERGASHLDRSYSCSYSHRNGPGPFCHKGPISEEGLGASV 
MSLSSENPYATIRDLPSLPGEPRESGYVEMKGPPSVSPPRQSLHLRDRQQRQLQPQRD 
SGTYEQPSPLSHNEESLGSTPPLPPGLPPGHYDSPKNSHIPGHYDLPPVRHPPSPPSR 
RQDR" 



ORIGIN 



Query Match 18.9%; Score 646.2; DB 10; Length 4482;
Best Local Similarity 57.1%; Pred. No. 5e-188;
Matches 1239; Conservative 0; Mismatches 918; Indels 12; Gaps 3;

Qy 74 CTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACTCAGTGACTGTGC 133
I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I III
Db 390 CACTCAACTCCAATGATCCCAATGTCTGTACCTTTTGGGAAAGCTTCACCACGACCACTA 449



Qy 134 AAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGCACTGACATTC 190
I I I I I I I I I I I I I I I I I I I I I I I I
Db 450 AGGAGTCCCACCTGCGCCCCTTCAGCCTGCTCCCAGCTGAGTCCTGCCACAGGCCCTGGG 509

Qy 191 TAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTATCGACATGGGG 250
I I I II I I I I I II I I I I I I I I I I I I I I I I I I
Db 510 AGGACCCCCACACCTGTGCCCAGCCTACGGTTGTCTACCGGACTGTGTACCGTCAGGTGG 569

Qy 251 AGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAAAGCGGGGAAA 310
I I II I I II I I I I I I I I III I I I I I I I I I
Db 570 TGAAGATGGACTCCCGCCCACGCCTGCAGTGCTGTAGGGGTTACTACGAGAGCAGAGGGG 629

Qy 311 TGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCAAACACCT 370
I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I
Db 630 CCTGTGTCCCACTCTGTGCCCAGGAGTGTGTCCATGGTCGCTGTGTGGCTCCGAATCAGT 689

Qy 371 GTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGTGATCACT 430
I I I I I I II I I I I I I I I I I I I I I I I I I I II I I II I I I
Db 690 GCCAGTGTGCACCAGGCTGGCGGGGTGGCGACTGCTCCAGCGAGTGTGCCCCGGGAATGT 749

Qy 431 GGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAACCCCATCA 490
I I I I I I I I I I II I I I I I I I II I M I I I I I I
Db 750 GGGGACCACAGTGTGACAAGTTCTGCCACTGTGGCAACAACAGTTCCTGTGATCCCAAGA 809

Qy 491 CCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGC 550
I I I I I I I II I I I I I I I I I I I I I I I I I I
Db 810 GTGGGACGTGCTTTTGCCCCTCTGGCCTGCAGCCCCCCAACTGCCTTCAGCCCTGCCCTG 869

Qy 551 AGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCG 610
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I
Db 870 CCGGCCACTATGGTCCTGCCTGCCAGTTTGATTGCCAGTGC TATGGGGCATCCTGTG 926

Qy 611 ACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATC 670
Ml I Ml M I I I I I I I I I I I I III II MM I I I
Db 927 ACCCCCAGGATGGAGCCTGTTTCTGCCCTCCAGGGAGAGCAGGACCCAGCTGTAATGTGC 986

Qy 671 TTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAG 730
I II I I I I M I I Ml I M I I
Db 987 CCTGTTCACAGGGCACTGATGGCTTCTTCTGCCCCAGAACCTATCCTTGCCAAAATGGAG 1046

Qy 731 GAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGT 790
Ml III I I II I I I I I II I I I I I I II
Db 1047 GTGTTCCTCAGGGCTCCCAAGGCTCCTGCAGCTGCCCACCGGGCTGGATGGGTGTCATTT 1106

Qy 791 GTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCC 850
II I I II I I I II I M II I I I I I I I I I I M I M M I II I I I I I
Db 1107 GTTCCCTGCCATGCCCAGAGGGTTTCCATGGACCCAACTGTACTCAGGAATGTCGCTGCC 1166

Qy 851 ATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAG 910
I M II II Mill I I I I I I I I II I I I M II I I I I
Db 1167 ACAACGGTGGCCTCTGTGACAGGTTTACTGGGCAGTGCCACTGTGCTCCTGGCTATATCG 1226

Qy 911 GGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCT 970
II II M I I I II I II II I II M II I I II II I I I M II M I II I I
Db 1227 GGGATCGGTGCCAAGAAGAGTGCCCCGTGGGCCGCTTCGGTCAAGACTGTGCTGAGACCT 1286

Qy 971 GCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAG 1030
I | MM II I III I I I I I I I I I I II II II M I
Db 1287 GTGACTGTGCTCCTGGCGCCCGTTGCTTTCCTGCTAATGGCGCGTGTCTGTGTGAACATG 1346

Qy 1031 GCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAAT 1090
MM I MUM I II I II M I M I I I I I II I II II I
Db 1347 GCTTCACAGGCGACCGCTGCACTGAGCGCCTCTGTCCGGATGGCCGCTATGGTCTGAGCT 1406

Qy 1091 GTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAG 1150
I | II II I I II I I I I II I I I M I I I II I I II I II I I I I
Db 1407 GCCAGGAGCCCTGCACCTGCGACCCAGAACACAGTCTCAGCTGCCACCCGATGCACGGCG 1466

Qy 1151 AGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGAT 1210
I I I I II I I I I II I I I I M I I I II I II III I
Db 1467 AGTGCTCCTGCCAGCCAGGTTGGGCGGGCCTCCACTGCAACGAGAGCTGCCCTCAGGACA 1526

Qy 1211 TCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTG 1270
III I I I I I M I I II I I Mil I I I I III I
Db 1527 CGCATGGCCCGGGCTGCCAGGAGCACTGCCTCTGTCTGCACGGAGGGCTCTGCCTTGCCG 1586

Qy 1271 TGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCC 1330



I II Ml 1 1 I I I I I I I I I I III I I I I I I I I I I I I I I I
Db 1587 ACAGCGGCCTCTGCCGGTGCGCGCCGGGATACACGGGACCTCACTGCGCTAACCTATGTC 1646

Qy 1331 CTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCT 1390
I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I N
Db 1647 CACCGGACACTTACGGGATCAACTGTTCCTCCCGCTGCTCCTGTGAAAATGCCATTGCCT 1706

Qy 1391 GCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCA 1450
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I
Db 1707 GCTCTCCCATCGACGGCACGTGCATCTGCAAGGAAGGTTGGCAGCGTGGTAACTGCTCTG 1766

Qy 1451 TCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACG 1510
I I I I I I I I I I I I I I I I I I! I I I I I I I I I I I I I I I I I I I
Db 1767 TTCCCTGTCCCCTTGGCACCTGGGGCTTCAATTGCAATGCCAGTTGCCAGTGTGCCCACG 1826

Qy 1511 GGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGA 1570
I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I III
Db 1827 ACGGAGTCTGCAGCCCCCAAACTGGAGCCTGTACTTGCACCCCTGGGTGGCATGGTGCTC 1886

Qy 1571 AATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACT 1630
I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I
Db 1887 ACTGCCAGCTTCCCTGCCCGAAGGGACAGTTTGGTGAAGGCTGTGCCAGTGTCTGTGACT 1946

Qy 1631 GCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGT 1690
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1947 GTGACCACTCTGATGGCTGTGACCCTGTTCATGGACAGTGCCGATGTCAGGCTGGTTGGA 2006

Qy 1691 CAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGC 1750
II I I I I II II I I I I I I I I I I I I I I I I I I I
Db 2007 TGGGCACACGCTGCCACCTGCCTTGCCCGGAGGGCTTTTGGGGAGCCAACTGCAGTAACA 2066

Qy 1751 CCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCAC 1810
I I I I III II I I II I I I I II I I I I I I I I II I I I I I I I I I M I
Db 2067 CCTGTACCTGCAAGAATGGTGGTACCTGTGTGTCTGAGAATGGCAACTGCGTGTGCGCAC 2126

Qy 1811 CAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCT 1870
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Mill
Db 2127 CAGGGTTCCGAGGCCCCTCCTGCCAGAGGCCCTGCCCGCCTGGTCGCTATGGCAAACGCT 2186

Qy 1871 GCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCC 1930
I II I I I I I I I I I I I I I I I I I III
Db 2187 GTGTGCA ATGCAAGTGTAACAACAACCATTCTTCCTGCCACCCATCGGACGGGA 2240

Qy 1931 TGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCA 1990
II I I I I I I I I I I I I I I I I I I I I I I III I I I I I I III
Db 2241 CCTGCTCCTGCCTGGCGGGCTGGACAGGCCCTGACTGCTCCGAGGCATGTCCCCCAGGCC 2300

Qy 1991 GATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCA 2050
I II MM I I I I I M I I I I M I I M I I I
Db 2301 ACTGGGGACTCAAATGCTCCCAACTCTGCCAGTGTCATCATGGTGGGACCTGCCACCCCC 2360

Qy 2051 TTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTC 2110
II I III II II I I II II II I I I I I II II III
Db 2361 AGGATGGGAGCTGTATCTGCACGCCAGGCTGGACTGGACCCAACTGCTTGGAAGGCTGCC 2420

Qy 2111 CACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCT 2170
II II I II I II I I I Ml II III M I II I II
Db 2421 CACCAAGAATGTTTGGTGTCAACTGCTCCCAGCTATGTCAGTGTGATCTCGGAGAGATGT 2480



Qy 2171 GCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTC 2230 

Mill I II I I I I I I I I I I I I III I I I I I I 

Db 2481 GCCACCCAGAGACTGGGGCTTGTGTCTGTCCCCCAGGACACAGTGGTGCAGACTGCAAAA 2540 

Qy 2231 AGAGATGTC 2239 

I I I I I 
Db 2541 TGGGAAGCC 2549 



RESULT 13 

AX492979 

LOCUS AX492979 3574 bp DNA linear PAT 26-SEP-2002
DEFINITION Sequence 16 from Patent WO02059312.
ACCESSION AX492979
VERSION AX492979.1 GI:23338634
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1
AUTHORS Kallick,D.A., Lee,S., Xu,Y., Yao,M.G., Yue,H., Bandman,O.B.,
Burford,N., Gandhi,A.R., Graul,R.C., Lal,P.G., Lu,D.A., Lu,Y.,
Tang,T.Y., Duggan,B.M., Gietzen,K.J., Hillman,J.L., Honchell,C.D.,
Ramkumar,J., Walia,N.K. and Warren,B.A.
TITLE Cell adhesion proteins
JOURNAL Patent: WO 02059312-A 16 01-AUG-2002;
INCYTE GENOMICS INC (US)
FEATURES Location/Qualifiers
source 1..3574
/organism="Homo sapiens"
/mol_type="unassigned DNA"
/db_xref="taxon:9606"
/note="Incyte ID No: 4097936CB1"



ORIGIN 



Query Match 17.5%; Score 600.6; DB 6; Length 3574;
Best Local Similarity 57.4%; Pred. No. 6.8e-174;
Matches 1165; Conservative 0; Mismatches 849; Indels 15; Gaps 4;

Qy 62 GGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACT 121
II II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I
Db 289 GGCTGGCTGGAACTCTCAACCCCAGTGATCCCAATACCTGCAGCTTCTGGGAAAGCTTCA 348



Qy 122 CAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC- 180
I II I I I I I I I I I I I I I I I I I II I I I I I
Db 349 CTACCACCACCAAGGAGTCCCACTCCCGCCCCTTCAGCCTGCTCCCCTCAGAGCCCTGCG 408

Qy 181 --ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCT 238
I I II 111 I I II II II I I I I I I I I I
Db 409 AGCGGCCCTGGGAGGGCCCCCATACTTGCCCCCAGCCCACGGTTGTATACCGGACCGTGT 468

Qy 239 ATCGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATG 298
I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I
Db 469 ACCGTCAGGTGGTGAAGACGGACCACCGCCAGCGCCTGCAGTGCTGCCATGGCTTCTATG 528



Qy    299 AAAGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTG 358
          I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db    529 AGAGCAGGGGGTTCTGTGTCCCGCTCTGTGCCCAGGAGTGTGTCCATGGCCGTTGTGTGG 588

Qy    359 CTCCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCG 418
          I I I I I I I I I I I II I I I I I II I I I I I I I I II I I II I I I IM
Db    589 CACCCAATCAGTGCCAATGTGTGCCAGGCTGGCGGGGCGACGACTGTTCCAGTGAGTGTG 648

Qy    419 ATGGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGT 478
          I I I I I I I I I I I I I I I I I I III II M
Db    649 CCCCAGGAATGTGGGGGCCACAGTGTGACAAGCCCTGCAGCTGCGGCAACAACAGCTCGT 708

Qy    479 GCAACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGG 538
          I I I I I I I I I I I II II I I I M III I I I I I I
Db    709 GTGATCCCAAGAGTGGGGTATGTTCTTGCCCTTCTGGTCTGCAGCCCCCGAACTGCCTTC 768

Qy    539 ACCGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATG 598
          I I I I I I I I I I I I I I I I I I I I I I I II
Db    769 AGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGTTCCGCTGCCAGTGCC---ATG 825

Qy    599 GAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCT 658
          III I I I I I I I I I I I I I I III I 1 I 1 I I II I
Db    826 GGGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGAGAGAACTGGGCCCA 885

Qy    659 TCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTT 718
          I I I I I I I I I I I I I I I III II I I I I I
Db    886 GCTGTGACGTGTCCTGTTCCCAGGGCACTTCTGGCTTCTTCTGCCCCAGCACCCATCCTT 945

Qy    719 GTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGA 778
          I I I I I I II I II I M II I M I I
Db    946 GCCAAAATGGAGGTGTCTTCCAAACCCCACAGGGCTCCTGCAGCTGCCCCCCTGGCTGGA 1005

Qy    779 TGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAG 838
          I I I I II I III I I I I I I I I I I I I I I III I I I I I I I I I I I
Db   1006 TGGGCACCATCTGCTCCCTGCCCTGCCCAGAGGGCTTTCACGGACCCAACTGCTCCCAGG 1065

Qy    839 AATGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTC 898
          I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III II
Db   1066 AATGTCGCTGCCACAACGGCGGCCTCTGTGACCGATTCACTGGGCAGTGCCGCTGCGCTC 1125

Qy    899 CAGGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCT 958
          I I I I I I I I I I I I I I I I I I I I III I I I I I M M M II IN M
Db   1126 CGGGTTACACTGGGGATCGGTGCCGGGAGGAGTGCCCGGTGGGCCGCTTTGGGCAGGACT 1185

Qy    959 GTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCC 1018
          I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I
Db   1186 GTGCTGAGACGTGCGACTGCGCCCCGGACGCCCGTTGCTTCCCGGCCAACGGCGCATGTC 1245

Qy   1019 TCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCT 1078
          I II III I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I III
Db   1246 TGTGCGAACACGGCTTCACTGGGGACCGCTGCACGGATCGCCTCTGCCCCGACGGCTTCT 1305

Qy   1079 ACGGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACC 1138
          I I I I II I II Mill Ml Ml I I I I I I 1 I I I
Db   1306 ACGGTCTCAGCTGCCAGGCCCCCTGCACCTGCGACCGGGAGCACAGCCTCAGCTGCCACC 1365



Qy   1139 CCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACAT 1198
          1 Ml II 1 II II 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1
Db   1366 CGATGAACGGGGAGTGCTCCTGCCTGCCGGGCTGGGCGGGCCTCCACTGCAACGAGAGCT 1425

Qy   1199 GTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAG 1258
          1 II 1 1 1 II II Mill 1 M 1 1 1 1 1
Db   1426 GCCCGCAGGACACGCATGGGCCAGGGTGCCAGGAGCACTGTCTCTGCCTGCACGGTGGCG 1485

Qy   1259 ACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCT 1318
          Ml 1 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1
Db   1486 TCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGCCGGGTTACACGGGCCCTCACTGTG 1545

Qy   1319 CTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAA 1378
          Ml 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db   1546 CTAGTCTTTGTCCTCCTGACACCTACGGTGTCAACTGTTCTGCACGCTGCTCATGTGAAA 1605

Qy   1379 ATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGG 1438
          I M 1 1 1 1 1 1 1 1 1 1 Mill II 1 II 1 1 II M 1 II 1 M II
Db   1606 ATGCCATCGCCTGCTCACCCATCGACGGCGAGTGCGTCTGCAAGGAAGGTTGGCAGCGTG 1665

Qy   1439 TGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCC 1498
          I 1 M 1 1 1 1 M M II 1 1 II II 1 II 1 1 1 1 1 1 1 M 1 1
Db   1666 GTAACTGCTCTGTGCCCTGCCCACCCGGAACCTGGGGCTTCAGTTGCAATGCCAGCTGCC 1725

Qy   1499 AGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGAT 1558
          I II 1 1 1 1 II II II II 1 1 II 1 1 1 . M 1 1 1 1 1 1 1 II II 1 1
Db   1726 AGTGTGCCCATGAGGCAGTCTGCAGCCCCCAAACTGGAGCCTGTACCTGCACCCCTGGGT 1785

Qy   1559 GGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTG 1618
          III II II 1 1 II 1 II 1 1 1 II 1 1 1 1 1 M II 1 1 1 M
Db   1786 GGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGGGGCAGTTTGGAGAAGGTTGTGCCA 1845

Qy   1619 AGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCC 1678
          1 M 1 1 1 II II II M 1 1 II II II 1 1 1 II 1 1 1 1 1 1 1 MM
Db   1846 GTCGCTGTGACTGTGACCACTCTGATGGCTGTGACCCTGTTCATGGACGCTGTCAGTGCC 1905

Qy   1679 TCCCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCA 1738
          1 1 1 1 1 1 MM II 1 1 1 II II 1 M 1 1 1 1 Mill II
Db   1906 AGGCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCTGCCCTGAGGGCTTATGGGGAGTCA 1965

Qy   1739 ACTGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCT 1798
          I 1 II 1 1 1 1 1 1 II 1 II M 1 1 1 II 1 M 1 II II 1 II 1 1 II II
Db   1966 ACTGTAGCAACACCTGCACCTGCAAGAATGGGGGCACCTGTCTCCCTGAGAATGGCAACT 2025

Qy   1799 GCGAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTT 1858
          1 M 1 1 1 II 1 1 1 1 1 1 II II 1 M 1 II 1 1 1 1 1 II 1 III II 1 1 1 1
Db   2026 GCGTGTGTGCACCCGGATTCCGGGGCCCCTCCTGCCAGAGATCCTGTCAGCCTGGCCGCT 2085

Qy   1859 ATGGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACC 1918
          1 II 1 I 1 II II 1 1 II 1 1 II 1 M 1 1 1
Db   2086 ATGGCAAACGCTGTGTGCCCTGCAAGTGCGCTAACCACTCC---------TTCTGCCACC 2136

Qy   1919 ACATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGT 1978
          I I Ml II 1 II II II II 1 II 1 1 1 M M M M II 1 1
Db   2137 CCTCGAACGGGACCTGCTACTGCCTGGCTGGCTGGACAGGCCCCGACTGCTCCCAGCGCT 2196

Qy   1979 GTCCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAA 2038

Db   2197 GCCCTCTGGGGACATTTGGTGCTAACTGCTCCCAGCCATGCCAGTGTGGTCCTGGAGAAA 2256

Qy   2039 CCTGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGG 2087
          II I I I I I I I I I III Mill I 111
Db   2257 AGTGCCACCCAGAGACTGGGGCCTGTGTATGTCCCCCAGGGCACAGTGG 2305



RESULT 14 

RATORFD 

LOCUS       RATORFD      660 bp    mRNA    linear   ROD 09-AUG-1996
DEFINITION  Rattus norvegicus (clone REM4) ORF mRNA, partial cds.
ACCESSION   L41686
VERSION     L41686.1  GI:780366
KEYWORDS    monoclonal autoantibody.
SOURCE      Rattus norvegicus (Norway rat)
  ORGANISM  Rattus norvegicus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1  (bases 1 to 660)
  AUTHORS   Asakura,K., Pogulis,R.J., Pease,L.R. and Rodriguez,M.
  TITLE     A monoclonal autoantibody which promotes central nervous system
            remyelination is highly polyreactive to multiple known and novel
            antigens
  JOURNAL   J. Neuroimmunol. 65 (1), 11-19 (1996)
  MEDLINE   96235155
   PUBMED   8642059
COMMENT     Original source text: Rattus norvegicus (strain Holzman) cDNA to
            mRNA.
FEATURES             Location/Qualifiers
     source          1..660
                     /organism="Rattus norvegicus"
                     /mol_type="mRNA"
                     /strain="Holzman"
                     /db_xref="taxon:10116"
                     /clone="REM#4"
                     /tissue_type="brain"
                     /dev_stage="neonate"
     mRNA            <1..>660
     CDS             <1..>660
                     /note="ORF"
                     /codon_start=1
                     /protein_id="AAB05844.1"
                     /db_xref="GI:780367"
                     /translation="RGATCDHITGECRCSPGYTGAFCEDLCPPGKHGPQCEQRCPCQN
                     GGVCHHVTGECSCPSGWMGTVCGQPCPEGRFGKNCSQECQCHNGGACDAATGQCHCSP
                     GYTGERCQDECPVGTYGVRCAETCRCVNGGKCYHVSGTCLCEAGFSGEFCEARLCPEG
                     LYGIKCDKRCPCHLDNTHSCHPMSGECGCKPGWSGLYCNETCSPGFYGEACQQICSCQ
                     NG"



ORIGIN 



Query Match 16.0%; Score 548.2; DB 10; Length 660;

Best Local Similarity 89.6%; Pred. No. 8.7e-158;

Matches 589; Conservative 0; Mismatches 68; Indels 0; Gaps 0;



Qy 598 GGAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCC 657



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 4 GGAGCGACCTGTGACCACATCACTGGGGAATGTCGTTGTTCACCTGGATACACTGGAGCC 63 

Qy 658 TTCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCT 717

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 64 TTCTGTGAGGATCTGTGTCCTCCTGGCAAACATGGCCCACAGTGTGAGCAGAGATGTCCC 123 

Qy 718 TGTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGG 777 

II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 124 TGCCAAAATGGAGGCGTGTGCCATCATGTCACTGGAGAGTGCTCCTGCCCTTCTGGCTGG 183 

Qy 778 ATGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAA 837 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 184 ATGGGCACAGTGTGTGGTCAGCCCTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAA 243 

Qy 838 GAATGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGT 897

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 244 GAATGTCAGTGCCACAATGGAGGAGCGTGTGATGCTGCCACAGGCCAATGTCACTGCAGC 303 

Qy 898 CCAGGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTC 957 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 
Db 304 CCAGGATACACAGGGGAACGGTGTCAGGATGAATGTCCTGTTGGAACCTATGGAGTTCGC 363

Qy 958 TGTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGC 1017 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 
Db 364 TGTGCTGAGACCTGCAGGTGTGTTAATGGAGGGAAGTGTTACCATGTGAGCGGCACATGC 423

Qy 1018 CTCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTC 1077 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 424 CTATGCGAAGCAGGTTTTTCAGGTGAATTTTGCGAGGCTCGTCTGTGCCCTGAGGGGCTT 483 

Qy 1078 TACGGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCAC 1137

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 484 TACGGCATCAAATGTGACAAGCGGTGCCCCTGCCACCTGGACAATACTCATAGCTGTCAT 543

Qy 1138 CCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACA 1197 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I III 

Db 544 CCCATGTCTGGAGAGTGTGGCTGCAAGCCGGGTTGGTCTGGACTGTACTGTAATGAAACA 603 

Qy 1198 TGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGG 1254 

II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I III 

Db 604 TGCTCTCCTGGATTCTACGGGGAAGCTTGCCAACAGATCTGCAGCTGCCAAAACGGG 660 



RESULT 15 

AX079681 

LOCUS       AX079681     632 bp    DNA     linear   PAT 22-FEB-2001
DEFINITION  Sequence 425 from Patent WO0107611.
ACCESSION   AX079681
VERSION     AX079681, GI:13159250
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   Baker,K.P., Goddard,A. and Wood,W.I.
  TITLE     Human polypeptides and methods for the use thereof
  JOURNAL   Patent: WO 0107611-A 425 01-FEB-2001;
            Genentech, Inc. (US)
FEATURES             Location/Qualifiers
     source          1..632
                     /organism="Homo sapiens"
                     /mol_type="unassigned DNA"
                     /db_xref="taxon:9606"

ORIGIN 



Query Match 15.6%; Score 534.8; DB 6; Length 632; 

Best Local Similarity 95.3%; Pred. No. 1.3e-153; 

Matches 593; Conservative 0; Mismatches 24; Indels 5; Gaps 4; 
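
The "Best Local Similarity" figure is not defined in the report. The sketch below assumes it is the share of matching columns among all aligned columns (matches, mismatches and indels); the function name is hypothetical. Under that assumption it reproduces the percentages printed for the results in this listing.

```python
def best_local_similarity(matches, mismatches, indels):
    """Percent of aligned columns that are identical (assumed definition)."""
    columns = matches + mismatches + indels
    return round(100.0 * matches / columns, 1)

# Counts as printed in this report
print(best_local_similarity(593, 24, 5))      # RESULT 15
print(best_local_similarity(589, 68, 0))      # RESULT 14
print(best_local_similarity(1165, 849, 15))   # RESULT 13
```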



Qy      7 ATTTCTTTGAACTCATGCCTGAGCTTTATTT--GTTTATTGTTATG-CCACTGGATTGGG 63
          M 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db      1 ATTTTTTGAAATTAATGCNTGAGCTTTATTTTGTTTAATTGTTATGCCCACTGGATTGGG 60

Qy     64 AC-AGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACTC 122
          II I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II
Db     61 ACAAGCATCACCTCTGAATTTTGAAGACCTTAATGTGTGTTAGCCATTGNAAAGCTACTC 120

Qy    123 AGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGCAC 182
          I 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1
Db    121 AAGTGCTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGCAC 180

Qy    183 TGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTATCG 242
          I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db    181 TGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTATCG 240

Qy    243 ACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAAAG 302
          I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db    241 ACATGGGGAGAAGACTATG-ACAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAAAG 299

Qy    303 CGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCC 362
          I I I I 1 I 1 M 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db    300 CGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCC 359

Qy    363 AAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGG 422
          1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db    360 AAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGG 419

Qy    423 TGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAA 482
          1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1
Db    420 TGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAA 479

Qy    483 CCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCG 542
          II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db    480 CCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCG 539

Qy    543 CTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGC 602
          1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db    540 CTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAATGCCAGAATGGAGC 599

Qy    603 CACCTGCGACCACGTCACGGGG 624
          1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II
Db    600 CACCTGCGACCACATCACGGGG 621





Search completed: March 30, 2004, 06:14:15 
Job time : 13036 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: March 29, 2004, 23:13:51 ; Search time 1266 Seconds 

(without alignments) 
11486.242 Million cell updates/sec 



Title: US-10-092-390-1 
Perfect score: 3423 

Sequence: 1 atggttatttctttgaactc gcagcagcagcagtgaatga 3423 

Scoring table: IDENTITY_NUC 

Gapop 10.0 , Gapext 1.0 

Searched: 3373863 seqs, 2124099041 residues 

Total number of hits satisfying chosen parameters: 6747726 
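
The "Million cell updates/sec" figure in the header can be checked against the other header values. The sketch below assumes one dynamic-programming cell per query residue per database residue, doubled because both strands appear to be searched (the hit total is exactly twice the number of sequences searched). The function name and the factor of two are assumptions, not something the report states.

```python
def cell_updates_per_second(query_len, db_residues, seconds, strands=2):
    """Estimate alignment-matrix cell updates per second for one search."""
    cells = query_len * db_residues * strands
    return cells / seconds

# Values printed in this header: query length 3423, 2124099041 residues, 1266 s
rate = cell_updates_per_second(3423, 2_124_099_041, 1266)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

With these inputs the estimate agrees with the printed 11486.242, which suggests the assumed formula is at least close to the one the program uses.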



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
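
The header names a Smith-Waterman ("sw") model with the IDENTITY_NUC table, Gapop 10.0 and Gapext 1.0. The sketch below is a generic affine-gap local alignment scorer, not the GenCore implementation. The match and mismatch values (+1 and -0.6) and the convention that a gap of k residues costs Gapop + k * Gapext are assumptions chosen to be consistent with scores printed in this report.

```python
def smith_waterman_score(query, target, match=1.0, mismatch=-0.6,
                         gap_open=10.0, gap_ext=1.0):
    """Best local alignment score with affine gaps (Gotoh recurrences).

    Assumed convention: a gap of k residues costs gap_open + k * gap_ext.
    """
    neg_inf = float("-inf")
    n = len(target)
    prev_h = [0.0] * (n + 1)        # best scores for the previous query row
    prev_f = [neg_inf] * (n + 1)    # previous row, alignments ending in a vertical gap
    best = 0.0
    for q in query:
        cur_h = [0.0] * (n + 1)
        cur_f = [neg_inf] * (n + 1)
        e = neg_inf                 # current row, alignments ending in a horizontal gap
        for j in range(1, n + 1):
            e = max(e - gap_ext, cur_h[j - 1] - gap_open - gap_ext)
            cur_f[j] = max(prev_f[j] - gap_ext, prev_h[j] - gap_open - gap_ext)
            diag = prev_h[j - 1] + (match if q == target[j - 1] else mismatch)
            cur_h[j] = max(0.0, diag, e, cur_f[j])   # 0.0 restarts the local alignment
            best = max(best, cur_h[j])
        prev_h, prev_f = cur_h, cur_f
    return best

print(smith_waterman_score("ACGT", "ACGT"))
```

Only the score is returned; recovering the aligned rows shown under ALIGNMENTS would additionally need a traceback, which is omitted here to keep the sketch short.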



Database : N_Geneseq_29Jan04:*

1: geneseqn1980s:*
2: geneseqn1990s:*
3: geneseqn2000s:*
4: geneseqn2001as:*
5: geneseqn2001bs:*
6: geneseqn2002s:*
7: geneseqn2003as:*
8: geneseqn2003bs:*
9: geneseqn2003cs:*
10: geneseqn2004s:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
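
The "% Query Match" column in the summaries appears to be the hit score expressed as a percentage of the perfect score (3423 for this query). This is an inference from the printed numbers, not a definition given by the report, and the function name below is hypothetical.

```python
def query_match_percent(score, perfect_score=3423):
    """Score as a percentage of the perfect (self-match) score, one decimal."""
    return round(100.0 * score / perfect_score, 1)

# Scores as printed in this report
for score in (3423, 1760, 600.6, 548.2, 534.8):
    print(score, query_match_percent(score))
```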



SUMMARIES 

                   %
Result           Query
   No.   Score   Match  Length  DB  ID             Description

     1    3423   100.0    3423   6  AAD46318       Aad46318 Human EGF
     2  3421.4   100.0    7522   9  ADE71243       Ade71243 Novel hum
     3  3419.8    99.9    7522   9  ADD18689       Add18689 Human dis
     4    1760    51.4    1761   6  AAD46319       Aad46319 Human EGF
     5  1625.4    47.5    2909   5  AAS72220       Aas72220 DNA encod
     6  1425.8    41.7    1448   2  AAX19959       Aax19959 Human Tan
     7  1049.4    30.7    1273   7  ABZ36212       Abz36212 Human sec





     8   618.4    18.1    1074   5  AAS 9 1 o z 6  n A cQl R96 niMA encod
     9   608.4    17.8    3567   4  AAF27 /91      Aa L Z / / z? J. R^t TANGO
    10   600.6    17.5    3574   6  ABAO 0059      PADHP-6 C
    11   581.6    17.0    1402   5  AAS o d / 4 o  J-\a bOO/IO DNA encod
    12   563.8    16.5     936   5  AAS72218       A-a c7991 ft Aa S / Z Z J. 0 HMA pnrnd
    13   563.8    16.5     936   7  ACD05889       acuuj o o r Mnt re* T \\ 1 1 m IwvCl I1U-1LL
    14   563.8    16.5     936   7  ACD05589       ACaU DDOzf O JJ1N A trin^ w
    15     554    16.2    2295   5  AAS72219       Aa S / riM A pnpoH
    16     554    16.2    2295   5  AAS91825       7\-. c Q1 ft 9 ^ r^M A pn paH
    17   540.6    15.8    1970   4  AAH34884       A3 flO'ioO £ l XlUILLciil LU1
    18   534.8    15.6     632   5  AAF93604       Aa iyjoui
    19   528.6    15.4    3150   4  AAF27788       j a f 9770 Q Aa L Z / /DO T-T i ima n TAM
    20   528.6    15.4    5036   4  AAF27787       Aa I Z / / 0 / M UlUci.1.1 J.^-UM
    21   476.4    13.9    1908   4  AAF27792       A^ -P977 Q9 Aa r z / / y z KdL l/AlMkJ^
    22   427.4    12.5    2220   7  ABX08773       A v\ y^f t r\ f~~i /zi n AiiCjlOy cllc
    23   424.8    12.4    1205   4  AAD02807       Aaauz o u /
    24   420.2    12.3    3887   7  ABX34750       ADXo4 / jU numan iliu.u.
    25   396.8    11.6    7334   7  ABT33368       1NUVA UiNr\
    26   395.4    11.6    2700   4  AAD02810       AaaUZo iu lit 1LU U O C
    27     393    11.5    4835   7  ABT33369       INUVA iJINA
    28     390    11.4    4508   9  ADD78266       Acta. / O Z D O Unman PCD
    29   386.8    11.3    4733   7  ABT33366       AdljjjDD JNUVA JJlN/\
c   30     386    11.3    3005   6  ABL53693       AK1 RQCQQ AD IDobyo riUmari uui
    31   383.8    11.2    7319   7  ABT33365       ADlojoDj
    32   383.8    11.2    7337   7  ABT33364       Mn\/Y nw a
    33   377.6    11.0     390   5  AAS91824       Aa s y i o z 4 UJNA encou
    34   377.6    11.0     390   5  AAS86742       Aa sob / ft z "P\"KT A ^ Y~\ /~\ i~\ jjjna encoQ
    35     371    10.8     412   5  AAS91823       Aa s y 1 o z o djna encou
    36     371    10.8     412   5  AAS86741       A ^» o Q £7 /I 1 Aa sod / 4 1 T -1 ! "NT A £^ Y"t f* r\
    37   358.4    10.5    5049   8  ADA21192       A/-l 9 1 1 Q9 Aaaz i i y z numan occ^
    38   326.6     9.5    5096   8  ADA21191       Ada z 1 1 y i riuinan sec
    39   315.4     9.2    1316   4  ABL19879       Aoiiyo /y Drosoph.il
    40     272     7.9     353   3  AAA4 lool      T\cL d41JOi ilLllllCLiX O *3


c   41     243     7.1    3493   4  ABL19878       Abl19878 Drosophil
c   42     243     7.1   17478   4  ABL04034       Abl04034 Drosophil
    43   202.4     5.9     230   3  AAA44616       Aaa44616 Human sec
    44   178.2     5.2     971   5  AAS72263       Aas72263 DNA encod
    45     175     5.1    1686   4  AAD02811       Aad02811 HMEIR04 c



ALIGNMENTS 



RESULT 1 
AAD46318 

ID AAD46318 standard; cDNA; 3423 BP. 
XX 

AC AAD46318; 
XX 

DT 27-JAN-2003 (first entry) 
XX 

DE Human EGF-family protein encoding cDNA #1. 
XX 

KW Human; EGF-family protein; novel human protein; NHP; drug discovery; 

KW restriction fragment length polymorphism analysis; forensic biology; 

KW toxicity; infectious disease; biological disorder; medical disorder; 

KW mental disorder; gene therapy; gene; ss. 



XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 1..3423

FT /*tag= a 

FT /product= "Human EGF-family protein #1" 

XX 

PN WO200272611-A2. 
XX 

PD 19-SEP-2002. 
XX 

PF 06-MAR-2002; 2002WO-US007477.
XX 

PR 12-MAR-2001; 2001US-0275013P.
XX 

PA (LEXI-) LEXICON GENETICS INC. 
XX 

PI Yu X, Miranda M; 
XX 

DR WPI; 2002-723315/78. 

DR P-PSDB; AAE27985. 
XX 

PT New novel human nucleic acids useful for e.g. identifying protein coding 

PT sequences and mapping unique genes to a particular chromosome, as DNA 

PT markers for restriction fragment length polymorphism analysis, or in 

PT forensic biology. 
XX 

PS Claim 1; Page 36-37; 42pp; English. 
XX 

CC The present sequence is a cDNA encoding human EGF-family protein, a novel 

CC human protein (NHP). The NHP sequences are useful for mapping unique

CC genes to a particular chromosome; as DNA markers for restriction fragment 

CC length polymorphism analysis; in forensic biology; in defining and 

CC monitoring both drug action and toxicity; in identifying, selecting and 

CC validating novel molecular targets for drug discovery; in microarrays or 

CC other assay formats to screen collections of genetic material from 

CC patients who have a particular medical condition. The NHP peptides, 

CC fusion proteins, antibodies, antagonists and agonists can be used for 

CC detecting mutant NHPs or inappropriately expressed NHPs for the diagnosis 

CC of disease; for screening drugs for treatment of symptomatic or 

CC phenotypic manifestations of perturbing the normal function of NHP in the 

CC body and to treat diseases including infectious, mental, biological, or 

CC medical diseases or disorders. They are also used in gene therapy 

XX 

SQ Sequence 3423 BP; 810 A; 904 C; 925 G; 784 T; 0 U; 0 Other; 

Query Match 100.0%; Score 3423; DB 6; Length 3423; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 3423; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
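
The printed scores can be related to the match, mismatch and gap counts reported for each result. The sketch below assumes +1 per match, -0.6 per mismatch, and Gapop per gap plus Gapext per indel residue. The mismatch value is not stated anywhere in the report; it is inferred from RESULT 14 (589 matches, 68 mismatches, no gaps, score 548.2), and the function name is hypothetical.

```python
def score_from_counts(matches, mismatches, indels, gaps,
                      match=1.0, mismatch=-0.6, gap_open=10.0, gap_ext=1.0):
    """Rebuild an alignment score from its summary counts (assumed scoring)."""
    return (matches * match + mismatches * mismatch
            - gaps * gap_open - indels * gap_ext)

print(round(score_from_counts(3423, 0, 0, 0), 1))     # RESULT 1 of this search
print(round(score_from_counts(589, 68, 0, 0), 1))     # RESULT 14 of the EST search
print(round(score_from_counts(1165, 849, 15, 4), 1))  # RESULT 13 of the EST search
```

The same counts for RESULT 15 of the EST search (593, 24, 5, 4) give 533.6 rather than the reported 534.8, so the sketch is not exact. That database sequence contains ambiguous N residues, which may be scored differently.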

Qy      1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60

Qy     61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120

Qy    121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180

Qy    181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240

Qy    241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300

Qy    301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360

Qy    361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420

Qy    421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480

Qy    481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540

Qy    541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600

Qy    601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660

Qy    661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720

Qy    721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780

Qy    781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840

Qy    841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900

Qy    901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960




Qy 961 GCT GAGACCT GC C AGT GT GT CAAC GGAGGGAAGT GT TACC ACGT GAG C G GC GC AT GC C T C 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020 

Qy 1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 108 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 108 0 

Qy 1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAA7^ACACTCATAGCTGTCACCCC 1140 

Qy 1141 AT GT CT GGAGAGTGTGCCT GCAAGCCGGGCT GGT CAGGACT CT ACT GTAAT GAGACAT GT 1200 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1141 AT GT CT GGAGAGT GT GCCT GCAAGCCGGGCT GGT CAGGACT CT ACT GTAAT GAGACAT GT 1200 

Qy 1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260 

Qy   1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320

Qy   1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380

Qy   1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440

Qy   1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500

Qy   1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560

Qy   1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620

Qy   1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680

Qy   1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1740
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1740

Qy   1741 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 1800
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1741 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 1800

Qy   1801 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 1860
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1801 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 1860

Qy   1861 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCAC 1920
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1861 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCAC 1920

Qy   1921 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 1980
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1921 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 1980

Qy   1981 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2040
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1981 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2040

Qy   2041 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2100
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2041 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2100

Qy   2101 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2160
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2101 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2160

Qy   2161 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2220
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2161 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2220

Qy   2221 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAA 2280
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2221 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAA 2280

Qy   2281 TGTCAAAACGGAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTC 2340
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2281 TGTCAAAACGGAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTC 2340

Qy   2341 ATGGGACGGCACTGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAG 2400
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2341 ATGGGACGGCACTGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAG 2400

Qy   2401 ATATGTGATTGTCTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGC 2460
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2401 ATATGTGATTGTCTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGC 2460

Qy   2461 CCCGGATGGAAGGGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAAC 2520
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2461 CCCGGATGGAAGGGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAAC 2520

Qy   2521 AGCTTAAGCCGAACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCA 2580
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2521 AGCTTAAGCCGAACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCA 2580

Qy   2581 GGCATCATCATTCTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGA 2640
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2581 GGCATCATCATTCTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGA 2640

Qy   2641 CACAAGCAGAAGGGAAAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGG 2700
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2641 CACAAGCAGAAGGGAAAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGG 2700

Qy   2701 GTCGTCAATGCAGATTATACCATTTCAGGAACCCTTCCTCACAGCAATGGTGGAAACGCT 2760
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2701 GTCGTCAATGCAGATTATACCATTTCAGGAACCCTTCCTCACAGCAATGGTGGAAACGCT 2760

Qy   2761 AATAGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCT 2820
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2761 AATAGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCT 2820

Qy   2821 CACGTCAACAACAGGGACAGGATGACTGTCACGAAGTCAAAAAACAATCAACTGTTTGTG 2880
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2821 CACGTCAACAACAGGGACAGGATGACTGTCACGAAGTCAAAAAACAATCAACTGTTTGTG 2880

Qy   2881 AATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTG 2940
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2881 AATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTG 2940

Qy   2941 CCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGA 3000
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2941 CCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGA 3000

Qy   3001 AGCTATATGGGAAAATCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAAC 3060
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3001 AGCTATATGGGAAAATCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAAC 3060

Qy   3061 TGCTCCCTAAGCAGTTCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATC 3120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3061 TGCTCCCTAAGCAGTTCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATC 3120

Qy   3121 CCGAAAAGCTCAGAGTGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCA 3180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3121 CCGAAAAGCTCAGAGTGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCA 3180

Qy   3181 TATGCAGAGATCAATAACTCAACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACA 3240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3181 TATGCAGAGATCAATAACTCAACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACA 3240

Qy   3241 GTGAGTGTTGTCCAAGGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGAC 3300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3241 GTGAGTGTTGTCCAAGGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGAC 3300

Qy   3301 CTCCCAAAGAACAGTCACATCCCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCA 3360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3301 CTCCCAAAGAACAGTCACATCCCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCA 3360

Qy   3361 TCCTCCCCTAAGCAAGAGGACAGTGGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAA 3420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3361 TCCTCCCCTAAGCAAGAGGACAGTGGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAA 3420

Qy   3421 TGA 3423
          |||
Db   3421 TGA 3423

Qy      1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    204 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 263

Qy     61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    264 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 323

Qy    121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    324 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 383

Qy    181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    384 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 443

Qy    241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    444 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 503

Qy    301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    504 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 563

Qy    361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    564 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 623

Qy    421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    624 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 683

Qy    481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    684 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 743

Qy    541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    744 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 803

Qy    601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    804 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 863

Qy    661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    864 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 923

Qy    721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    924 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 983

Qy    781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    984 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 1043

Qy    841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1044 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 1103

Qy    901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1104 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 1163

Qy    961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1164 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1223

Qy   1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1224 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1283

Qy   1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1284 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1343

Qy   1141 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1344 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1403

Qy   1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1404 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1463

Qy   1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1464 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1523

Qy   1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1524 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1583

Qy   1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1584 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1643

Qy   1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1644 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1703

Qy   1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1704 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1763

Qy   1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1764 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1823

Qy   1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1824 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1883

Qy   1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1740
          || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1884 CCCGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1943

Qy   1741 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 1800
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1944 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 2003

Qy   1801 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 1860
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2004 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 2063

Qy   1861 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCAC 1920
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2064 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCAC 2123

Qy   1921 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 1980
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2124 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 2183

Qy   1981 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2040
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2184 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2243

Qy   2041 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2100
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2244 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2303

Qy   2101 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2160
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2304 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2363

Qy   2161 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2220
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2364 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2423

Qy   2221 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAA 2280
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2424 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAA 2483

Qy   2281 TGTCAAAACGGAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTC 2340
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2484 TGTCAAAACGGAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTC 2543

Qy   2341 ATGGGACGGCACTGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAG 2400
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2544 ATGGGACGGCACTGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAG 2603

Qy   2401 ATATGTGATTGTCTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGC 2460
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2604 ATATGTGATTGTCTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGC 2663

Qy   2461 CCCGGATGGAAGGGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAAC 2520
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2664 CCCGGATGGAAGGGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAAC 2723

Qy   2521 AGCTTAAGCCGAACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCA 2580
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2724 AGCTTAAGCCGAACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCA 2783

Qy   2581 GGCATCATCATTCTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGA 2640
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2784 GGCATCATCATTCTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGA 2843

Qy   2641 CACAAGCAGAAGGGAAAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGG 2700
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2844 CACAAGCAGAAGGGAAAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGG 2903

Qy   2701 GTCGTCAATGCAGATTATACCATTTCAGGAACCCTTCCTCACAGCAATGGTGGAAACGCT 2760
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2904 GTCGTCAATGCAGATTATACCATTTCAGGAACCCTTCCTCACAGCAATGGTGGAAACGCT 2963



Qy   2761 AATAGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCT 2820
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2964 AATAGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCT 3023

Qy   2821 CACGTCAACAACAGGGACAGGATGACTGTCACGAAGTCAAAAAACAATCAACTGTTTGTG 2880
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3024 CACGTCAACAACAGGGACAGGATGACTGTCACGAAGTCAAAAAACAATCAACTGTTTGTG 3083

Qy   2881 AATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTG 2940
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3084 AATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTG 3143

Qy   2941 CCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGA 3000
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3144 CCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGA 3203

Qy   3001 AGCTATATGGGAAAATCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAAC 3060
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3204 AGCTATATGGGAAAATCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAAC 3263

Qy   3061 TGCTCCCTAAGCAGTTCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATC 3120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3264 TGCTCCCTAAGCAGTTCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATC 3323

Qy   3121 CCGAAAAGCTCAGAGTGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCA 3180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3324 CCGAAAAGCTCAGAGTGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCA 3383

Qy   3181 TATGCAGAGATCAATAACTCAACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACA 3240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3384 TATGCAGAGATCAATAACTCAACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACA 3443

Qy   3241 GTGAGTGTTGTCCAAGGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGAC 3300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3444 GTGAGTGTTGTCCAAGGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGAC 3503

Qy   3301 CTCCCAAAGAACAGTCACATCCCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCA 3360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3504 CTCCCAAAGAACAGTCACATCCCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCA 3563

Qy   3361 TCCTCCCCTAAGCAAGAGGACAGTGGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAA 3420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   3564 TCCTCCCCTAAGCAAGAGGACAGTGGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAA 3623

Qy   3421 TGA 3423
          |||
Db   3624 TGA 3626

XX

DE Human disease related protein DNA sequence SeqID120. 
XX 

KW human; disease state; cytostatic; antiinflammatory; ophthalmological;

KW antiarteriosclerotic; vulnerary; gene therapy;

KW hypoxia-regulated condition; tumourigenesis; angiogenesis; apoptosis;

KW inflammation; erythropoiesis; glycolysis; gluconeogenesis;

KW glucose transportation; catecholamine synthesis; iron transport;

KW nitric oxide synthesis; cancer; ischaemic condition; reperfusion injury;

KW retinopathy; neonatal stress; pre-eclampsia; atherosclerosis;

KW inflammatory condition; wound healing; gene; ds.

XX 

PT New substantially purified polypeptide, useful for diagnosing or treating 

PT a hypoxia-regulated condition, such as cancer, ischemia, reperfusion 

PT injury, retinopathy, pre-eclampsia, atherosclerosis, inflammation, or 

PT wound healing. 
XX 

PS Claim 27; SEQ ID NO 120; 424pp; English. 
XX 

CC This invention relates to novel human genes and gene products which are

CC implicated in certain disease states. Compounds which modulate the 

CC proteins of the invention may have cytostatic, antiinflammatory, 

CC ophthalmological, antiarteriosclerotic or vulnerary activities. The 

CC sequences of the invention may be useful for gene therapy. The invention 

CC may be useful for diagnosing or treating a hypoxia-regulated condition, 

CC such as tumourigenesis, angiogenesis, apoptosis, inflammation, 

CC erythropoiesis, or the biological response to hypoxia conditions 

CC including processes such as glycolysis, gluconeogenesis, glucose 

CC transportation, catecholamine synthesis, iron transport or nitric oxide 

CC synthesis. The disease includes cancer, ischaemic conditions, reperfusion 

CC injury, retinopathy, neonatal stress, pre-eclampsia, atherosclerosis, 

CC inflammatory conditions or wound healing. The present sequence is that of 

CC a disease related protein encoding DNA sequence of the invention. 

XX 

SQ Sequence 7522 BP; 2130 A; 1555 C; 1696 G; 2141 T; 0 U; 0 Other; 

Query Match 99.9%; Score 3419.8; DB 9; Length 7522; 
Best Local Similarity 99.9%; Pred. No. 0; 

Matches 3421; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 



Qy      1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    204 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 263

Qy     61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    264 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 323

Qy    121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    324 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 383

Qy    181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    384 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 443

Qy    241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    444 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 503

Qy    301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    504 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 563

Qy    361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    564 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 623

Qy    421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    624 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 683

Qy    481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    684 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 743

Qy    541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    744 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 803

Qy    601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    804 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 863

Qy    661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    864 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 923

Qy    721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    924 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 983

Qy    781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    984 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 1043

Qy    841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1044 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 1103



Qy    901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1104 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 1163

Qy    961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||
Db   1164 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCCCATGCCTC 1223

Qy   1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1224 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1283

Qy   1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1284 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1343

Qy   1141 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1344 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1403

Qy   1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1404 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1463

Qy   1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1464 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1523

Qy   1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1524 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTTWWVT 1583 

1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I II 

1584 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1643 

1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1644 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1703 

1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560 

II I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

17 04 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1763 

1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620 

I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

17 64 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1823 

1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
1824 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1883 

1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 174 0 

II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 

18 84 CCCGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1943 

1741 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 1800 



        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1944 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 2003

Qy 1801 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2004 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 2063

Qy 1861 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGGAGCGGGCCCTGCCACCAC 1920
Db 2064

Qy 1921 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2124 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 2183

Qy 1981 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2184 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2243

Qy 2041 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2244 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2303

Qy 2101 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2304 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2363

Qy 2161 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2364 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2423

Qy 2221 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAA 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2424 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAA 2483

Qy 2281 TGTCAAAACGGAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTC 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2484 TGTCAAAACGGAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTC 2543

Qy 2341 ATGGGACGGCACTGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAG 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2544 ATGGGACGGCACTGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAG 2603

Qy 2401 ATATGTGATTGTCTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGC 2460
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2604 ATATGTGATTGTCTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGC 2663

Qy 2461 CCCGGATGGAAGGGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAAC 2520
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2664 CCCGGATGGAAGGGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAAC 2723

Qy 2521 AGCTTAAGCCGAACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCA 2580
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2724 AGCTTAAGCCGAACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCA 2783

Qy 2581 GGCATCATCATTCTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGA 2640
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2784 GGCATCATCATTCTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGA 2843

2 641 C AC AAG C AG AAG G G AAAG G AAT C AAG CAT G C C AG C AGT T AC C T AC AC C C C T G C TAT GAG G 2700 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
2844 CACAAGCAGAAGGGAAAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGG 2903 

2701 GT C GT C AAT GC AG AT TAT AC CAT T T C AGGAAC C CT T C CT C AC AGCAAT G GT GGAAAC GCT 2760 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
2 904 GT C GT CAAT GCAGAT TAT AC CAT T T C AGGAAC C CTT C CT CACAGCAATGGT GGAAAC GCT 2963 

2761 AATAGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCT 282 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

2964 AATAGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCT 3023 

2821 C AC GT C AAC AAC AGG GACAGGAT GACT GT C AC GAAGT CAAAAAAC AAT C AACT GT T T GT G 288 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
3024 CAC GT CAACAAC AGGGACAGGAT GACT GT CAC GAAGT C AAAAAACAAT CAACT GTT T GT G 308 3 

2 881 AATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTG 2940 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 

308 4 AATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTG 314 3 

2941. CCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGA 3000 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

3144 CCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGA 3203 

3001 AGCT AT AT GG GAAAAT C CTTAAAAGAC CT GGGAAAGAAT T CT GAAT AT AAT T CAAGT AAC 3060 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
3204 AGCT AT AT GGGAAAAT CCTTAAAAGAC CT GGGAAAGAAT T CT GAAT AT AAT T CAAGT AAC 3263 

3061 T G CT C C CT AAGC AGT T CT G AGAAC C CAT AT GC CACT AT T AAAGAC C CAC C T GT ACT TAT C 312 0 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I II I 
3264 T GCT C C CT AAGC AGT T CT GAGAAC C CAT AT GC CACT AT T AAAGAC C CAC C T GT ACT TAT C 3323 

3121 CCGAAAAGCT CAGAGT GTGGTTATGT GGAGAT GAAATCGCCGGCACGAAGAGATTCC CCA 318 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
3324 C C GAAAAGCT CAGAGT GT GGT TAT GT GGAGAT GAAAT CGC CGG CAC GAAGAGAT T C C C CA 3383 

3181 T AT GCAGAGAT CAAT AACT CAACTTC AGC CAACAGGAAT GTCTAT GAAGT T GAAC CT AC A 3240 

II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I M I II II I I I I I I I I I I I 
3384 TAT G C AGAGAT CAAT AACT C AAC T T C AG C C AAC AG GAAT GT C TAT GAAGT T GAAC C T AC A 344 3 

3241 GT GAGT GTT GT C CAAGGAGTAT T CAGCAATAAT GG GC GT CTC T CC C AGGAT C CAT AT GAC 3300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
34 44 GT GAGT GT T GT C CAAG GAGT AT T CAGCAATAAT GG GC GT CT C T C C C AGGAT C CAT AT GAC 3503 

3301 C T C C C AAAG AAC AGT CAC AT C C C T T GT CAT TAT GAC C T G C T G C C AG T C C GAG AC AGT T C A 33 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
3504 CT CCCAAAGAAC AGT CACAT C C CTT GT CAT TAT GACCT G C T GC CAGT CC GAGACAGT T CA 3563 

3361 T C CT C C CCT AAG CAAGAGGAC AGTGGAGGT AG CAGC AGCAAC AGCAGCAG CAGCAGT GAA 3420 
I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

3564 T C CT C CCCT AAG CAAGAGGAC AGT G GAGGTAGCAGCAGCAAC AGCAGCAG CAG CAGT GAA 3623 

3421 TGA 3423 
I I I 

3624 TGA 3626 
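
The middle row of each alignment block marks the columns where the Qy and Db rows agree. A minimal Python sketch of how such a row can be rebuilt from the two sequence rows is given here; it is an illustration only, not the search program's own code. The helper names `clean` and `match_line` are hypothetical, and the sketch assumes both rows cover the same columns once stray whitespace is removed and that `-` marks a gap.

```python
def clean(row: str) -> str:
    """Remove whitespace that extraction inserted inside a sequence row."""
    return "".join(row.split()).upper()


def match_line(qy: str, db: str) -> str:
    """Return '|' where the two rows agree and ' ' where they differ.

    Assumption: '-' denotes a gap and never counts as a match.
    """
    qy, db = clean(qy), clean(db)
    if len(qy) != len(db):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "|" if a == b and a != "-" else " " for a, b in zip(qy, db)
    )


# Last block of the alignment above (Qy 3421-3423, Db 3624-3626).
print(match_line("TGA", "TGA"))            # |||
# A block whose rows differ at the third base.
print(repr(match_line("CC G G", "CCCG")))  # '|| |'
```

Counting the `|` characters over all blocks gives a figure that can be compared with the "Matches" value printed in each result header.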



RESULT 4 
AAD46319 

ID AAD46319 standard; cDNA; 1761 BP. 
XX 

AC AAD46319; 
XX 

DT 27-JAN-2003 (first entry) 
XX 

DE Human EGF-family protein encoding cDNA #2. 
XX 

KW Human; EGF-family protein; novel human protein; NHP; drug discovery; 

KW restriction fragment length polymorphism analysis; forensic biology; 

KW toxicity; infectious disease; biological disorder; medical disorder; 

KW mental disorder; gene therapy; gene; ss. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 1..1761

FT /*tag= a 

FT /product= "Human EGF-family protein #2"
XX 

PN WO200272611-A2. 
XX 

PD 19-SEP-2002. 
XX 

PF 06-MAR-2002; 2002WO-US007477.
XX

PR 12-MAR-2001; 2001US-0275013P.
XX 

PA (LEXI-) LEXICON GENETICS INC. 
XX 

PI Yu X, Miranda M; 
XX 

DR WPI; 2002-723315/78. 

DR P-PSDB; AAE27986. 
XX 

PT New novel human nucleic acids useful for e.g. identifying protein coding 

PT sequences and mapping unique genes to a particular chromosome, as DNA 

PT markers for restriction fragment length polymorphism analysis, or in 

PT forensic biology. 
XX 

PS Disclosure; Page 40; 42pp; English. 
XX 

CC The present sequence is a cDNA encoding human EGF-family protein, a novel 

CC human protein (NHP). The NHP sequences are useful for mapping unique

CC genes to a particular chromosome; as DNA markers for restriction fragment 

CC length polymorphism analysis; in forensic biology; in defining and 

CC monitoring both drug action and toxicity; in identifying, selecting and 

CC validating novel molecular targets for drug discovery; in microarrays or 

CC other assay formats to screen collections of genetic material from 

CC patients who have a particular medical condition. The NHP peptides, 

CC fusion proteins, antibodies, antagonists and agonists can be used for 

CC detecting mutant NHPs or inappropriately expressed NHPs for the diagnosis 

CC of disease; for screening drugs for treatment of symptomatic or 



CC phenotypic manifestations of perturbing the normal function of NHP in the 

CC body and to treat diseases including infectious, mental, biological, or

CC medical diseases or disorders. They are also used in gene therapy.
XX 

SQ Sequence 1761 BP; 370 A; 465 C; 525 G; 401 T; 0 U; 0 Other; 

Query Match 51.4%; Score 1760; DB 6; Length 1761; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1760; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60

Qy   61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120

Qy  121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180

Qy  181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240

Qy  241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300

Qy  301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360

Qy  361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420

Qy  421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480

Qy  481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540

Qy  541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600

Qy  601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660

Qy  661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720

Qy  721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780

Qy  781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840

Qy  841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900

Qy  901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960

Qy  961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020

Qy 1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080

Qy 1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140

Qy 1141 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1200

Qy 1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260

Qy 1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320

Qy 1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380

Qy 1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440

Qy 1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500

Qy 1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560

Qy 1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620

Qy 1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680

Qy 1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1740

Qy 1741 TGCTCCCTGCCCTGCTACTG 1760
        ||||||||||||||||||||
Db 1741 TGCTCCCTGCCCTGCTACTG 1760



RESULT 5
AAS72220

ID AAS72220 standard; cDNA; 2909 BP.
XX

AC AAS72220;
XX

DT 13-FEB-2002 (first entry)
XX

DE DNA encoding novel human diagnostic protein #8024.
XX

KW Human; chromosome mapping; gene mapping; gene therapy; forensic;

KW food supplement; medical imaging; diagnostic; genetic disorder; ss.
XX

OS Homo sapiens.
XX

PN WO200175067-A2.
XX

PD 11-OCT-2001.
XX

PF 30-MAR-2001; 2001WO-US008631.
XX

PR 31-MAR-2000; 2000US-00540217.

PR 23-AUG-2000; 2000US-00649167.
XX

PA (HYSE-) HYSEQ INC.
XX

PI Drmanac RT, Liu C, Tang YT;
XX

DR WPI; 2001-639362/73.

DR P-PSDB; ABG08033.
XX

PT New isolated polynucleotide and encoded polypeptides, useful in

PT diagnostics, forensics, gene mapping, identification of mutations

PT responsible for genetic disorders or other traits and to assess

PT biodiversity.
XX

PS Claim 1; SEQ ID NO 8024; 103pp; English.
XX

CC The invention relates to isolated polynucleotide (I) and polypeptide (II)

CC sequences. (I) is useful as hybridisation probes, polymerase chain

CC reaction (PCR) primers, oligomers, and for chromosome and gene mapping,

CC and in recombinant production of (II). The polynucleotides are also used

CC in diagnostics as expressed sequence tags for identifying expressed

CC genes. (I) is useful in gene therapy techniques to restore normal

CC activity of (II) or to treat disease states involving (II). (II) is

CC useful for generating antibodies against it, detecting or quantitating a

CC polypeptide in tissue, as molecular weight markers and as a food

CC supplement. (II) and its binding partners are useful in medical imaging

CC of sites expressing (II). (I) and (II) are useful for treating disorders

CC involving aberrant protein expression or biological activity. The

CC polypeptide and polynucleotide sequences have applications in

CC diagnostics, forensics, gene mapping, identification of mutations

CC responsible for genetic disorders or other traits to assess biodiversity

CC and to produce other types of data and products dependent on DNA and

CC amino acid sequences. AAS64197-AAS94564 represent novel human diagnostic

CC coding sequences of the invention. Note: The sequence data for this

CC patent did not appear in the printed specification, but was obtained in

CC electronic format directly from WIPO at

CC ftp.wipo.int/pub/published_pct_sequences

XX 

SQ Sequence 2909 BP; 688 A; 777 C; 760 G; 684 T; 0 U; 0 Other; 

Query Match 47.5%; Score 1625.4; DB 5; Length 2909; 

Best Local Similarity 80.2%; Pred. No. 0; 

Matches 2258; Conservative 0; Mismatches 36; Indels 521; Gaps 9; 
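
The percentage fields in this header can be cross-checked against the raw counts. The Python sketch below is a hedged illustration, not the search program's documented formulas. It assumes that "Query Match" is the score divided by the perfect score of 3423 stated at the top of the report, and that "Best Local Similarity" is the match count divided by the sum of matches, mismatches and indels. Both assumptions reproduce the values printed for this result and for the 1761 BP result before it. The function names are hypothetical.

```python
PERFECT_SCORE = 3423  # perfect score for the query, from the report header


def query_match(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Score as a percentage of the perfect score (assumed definition)."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Matches as a percentage of all aligned columns (assumed definition)."""
    columns = matches + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Result with Score 1625.4; Matches 2258; Mismatches 36; Indels 521
print(query_match(1625.4))                  # 47.5
print(best_local_similarity(2258, 36, 521))  # 80.2

# Result with Score 1760; Matches 1760; Mismatches 0; Indels 0
print(query_match(1760))                    # 51.4
print(best_local_similarity(1760, 0, 0))     # 100.0
```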

Qy 1130 GCTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTA 1189 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I 
Db 95 GCTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTA 154 

Qy 1190 ATGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAA 1249
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  155 ATGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAA 214

Qy 1250 ATGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAA 1309
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  215 ATGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAA 274

Qy 1310 TTGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTG 1369 

I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 275 TTGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTG 334 

Qy 1370 GCTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCT 1429
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  335 GCTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCT 394

Qy 1430 GGCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACT 1489 

I I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I 
Db 395 GGCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACT 4 54 

Qy 1490 TAACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTG 1549 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 455 TAACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTG 514 



Qy 1550 CACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGA 1609
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  515 CACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGA 574



Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 



1610 ACTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATT 1669 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I 

575 ACTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATT 634 

1670 GCCGCTGCCTCCCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCT 1729 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

635 GCCGCTGCCTCCCCGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCT 694 

1730 GGGGCCCCAACTGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATG 1789 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
695 GGGGCCCCAACTGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATG 754 

1790 ATGGCATCTGCGAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCC 1849 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
755 ATGGCATCTGCGAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCC 814 

1850 CTGGTTTT TAT GG GCAT C GCT GCAGC C AGAC AT GC CCACAGT GC GT T C ACAG GAG C GGGC 1909 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

815 CTGGTTTTTATGGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGC 874 

1910 CCTGCCACCACATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCA 1969 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M II I I I I I II I I I 
875 CCTGCCACCACATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCA 934 

1970 ATGA AGT GT GT C C C AGT GG C AGAT T T G G GAAAAAC T GT G C AG GAAT T T G 2018 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

935 AT GAAGCT TAT T CACAGT GT GT CC CAGT GG C AGAT TT GGGAAAAACT GT GCAGGAAT TT G 



994 



2 019 T AC C T GCAC CAACAAC GGAAC CT GT AAC CC C ATT GACAGAT CTT GT C AGT GT T AC C C C GG 2078 
I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I II I I I I I I II I I I I I I 
995 TACCT GC ACCAACAACGGAAC CT GT AAC C C CAT T GACAGAT C TT GT CAGT GT T AC CC C GG 1054 

2 079 TTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCAT 2138 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1055 TTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCAT 1114 

2139 CCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATG 2198 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1115 CCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATG 1174

2199 CACT CCT GGCT GGACAGGGCT CTACT GCACT CAGAGAT GT C C 2240 

I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I II 
1175 CACT CCT GGCT GGACAGGGCT CTACTGCACTCAGAAGATCCCCAAGACACT CTT GCAGGG 1234 

2241 T CT AGG GT T T TAT GGAAAAGATT GT G CACT GAT AT GC C AAT GT C AAAAC G 2290 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
1235 C AGCT GC C AG C CCATGGT T T TAT G GAAAAGATT GT GCACT GAT AT GC CAAT GT CAAAAC G 12 94 

2291 GAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGC 2350 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1295 GAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGC 1354 

2351 ACT GT GAGCAGA 2362 

I I I I I I I II I II 

1355 ACTGTGAGCAGAAGGTCAGGCCTCCCTGGGATCATCGCTGGTTACTCACTGCTCTGGGCG 1414 



1415 GAGGAGGTGTGACTACAAGAATGAAGACAGAATT CAAGTTTTC CATT CTGTTCT TTT GGG 1474 

2363 2362 

1475 CTCTCCCTTCCTCTCCTTCTTATTTTTGGAATGTGGCTGCTCAGAGCCTTAAAAGATCCA 1534 

2363 2362 

1535 GTCGAGCTTTCTTCATGGCAGAGGCAGAACCCGGGTCTCACATTGGAGGACAGTACATTC 1594 

2363 2362 

1595 GGTGGGGAGGAGGGCTGGTGGCCCAGGGCCAATCCCTGTTGCTGCCCTGTGCAGTCTGGA 1654 

2363 2362 

1655 CTGTGAGTGCCACCATGATTCCAGGAATGCTGTCCAGTTCTGGGACACTCTTGGGGGTAC 1714 

2363 2362 

1715 AAGT CAGCCT.CAATAGGAAT CCC C T GAAAGGACT CAG C.T.CT AGAT GT GCAGGGCT T G CAG 1774 

2363 2362 

1775 TCAGAGACAGCCTTGCTCCAAATTCCCAAGGTTGGAAAGCTACATTTGACTTTCCTTCTC 1834 
2363 AGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTGAACA 2419 



1835 TAGAGT GCCCTT CAGGAACATAT GGCTAT GGCT GTC GCCAGATATGT GATT GT CT GAACA 1894 

242 0 ACT C C AC CT GC GAC CAC AT C ACT GGGACCT GT TACT GCAGC C C C GGAT GG AAGGGAG C GA 2479 

I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I 

1895 ACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGAGCGA 1954 

2480 GAT GT GAT CAAGCT GGT GTTAT CAT AGTT GGAAAT CT GAACAGCTTAAGCCGAACCAGTA 2539 

I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1955 GAT GT GAT CAAG C T GGT GT TAT CAT AGTT GGAAAT C T GAACAGCTT AAG C C GAACC AGT A 2 014 

2540 CTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTTGTCC 2599 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I 

2015 CTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTTGTCC 2074 

2600 TAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACAAGCAGAAGGGAAAGG 2659 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

2075 TAGTT GTT CTCTT CCTACT GGCATTGT TCATTATTT ATAGACACAAGCAGAAGGGAAAGG 2134 

2660 AAT CAAGC AT GCCAGC AGT T AC CT AC AC CCC T GCT AT GAG G GT C GT C AAT GC AGAT TATA 2719 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

2135 AATCAAGCATGCCAGCAGTTACCTACACCCCT GCTATGAGGGT CGT C AAT GC AG AT TATA 2194 

2720 C CAT T T CAGGAAC C CT TC C T CACAGCAAT G GT G GAAAC G CT AAT AGC CACT ACT T CAC CA 2779 

I I I I I I I I I I I I I I I I I I I IIMII I I I I I I I I I I Ill 

2195 C CAT T T CAGGAACCCT T C CT CACAG CAAT G GT GGAAAC G CT AAT AG C CACT ACTT CAC CA 2254 
2780 AT C C C AGT T AC CAC AC GCT CAC C C AGT GT GC C AC AT C C C CT CAC GT CAACAACAGGG 2836 



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I I III 

Db 2255 ATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCTTCACGTTCAACAACAGGGG 2314 

Qy 2837 AC AG GAT G ACT G - T C AC GAAGT C AAAAAAC AAT C AACT GT T T GT GAAT C T T AAAAAT GT G 2895 

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2315 AC AG GAT GACT GT T C AC GAGT T C AAAAAAC AAT C AACT GT T T GT GAAT CT T AAAAAT GT G 2374 

Qy 2896 AACCCTGGGAAGAGAGGCCCTGT-GGGGGACTGCACTGGGACATTGCCGGCTGACTGGAA 2954 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2375 AACCCTGGGAAGAGAGGCCCTGTGGGGGGACTGCACTGGGACATTGCCGGCTGACTGGAA 24 34 

Qy 2955 ACATGGCGGCTACCTCAACGAGCT— CGGTGCTTTTGGACTTGACAGAAGCTATAT-GGG 3011 

I I I I I I I I II II I I II I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I II 

Db 2435 ACATGGCGGCTACCTCAACGAGCTTCGGTGCTTTTGGGACTTGACAGAAGCTATTTGGGG 24 94 

Qy 3012 AAAAT C CT T AAAAG AC C T GGGAAAGAATT CT GAAT AT AATT CAAGT AACT GCT C C CT 3068 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II 

Db 2495 AAAAT C CT T AAAAGGACCT G GGGAAAGGAT TT T T GAAT AT AAT T CAAGT AACT GCT C C CT 2554 

Qy 3069 AAGCAGT T CT GAGAAC C CAT AT GC CACT AT T AAAGAC C CAC CT GT ACTT AT C C CGAAAAG 3128 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2555 AAGCAGTT CTGAGAACCCATAT GCCACTAT TAAAGACCCACCT GTACTTATC CCGAAAAG 2614 

Qy 3129 CT C AGAGT GT G GT TAT GT G GAGAT GAAAT C G C C GGC AC GAAGAGATT C C C CAT AT G C AGA 3188 

I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I II I I I I I I I I II I I I I I I I I I I I 
Db 2615 CT CAGAGT GT G GT TAT GT G GAGAT GAAAT C GC C GGCACGAAGAGAT T C C C CAT AT GCAGA 2674 

Qy 3189 GAT C AAT AACT C AACT T CAGC C AACAG GAAT GT CT AT GAAGT T GAAC CT AC AGT GAGT GT 32 4 8 

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2675 GATC AAT AACT CAACT T CAGC CAAC AG GAAT GT CT AT GAAGT T GAAC CT ACAGT GAGT GT 2734 

Qy 3249 TGTCCAAGGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGACCTCCCAAA 3308 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2735 TGTCCAAGGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGACCTCCCAAA 27 94 

Qy 3309 GAACAGT CAC AT C C CT T GT CAT TAT GAC CT GCT G C C AGT C C GAGAC AGT T CAT C CT C C C C 3368 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2795 GAACAGT CAC AT C C C T T GT CAT TAT GAC C T G C T G C C AGT C C GAG AC AG T T CAT C C T C C C C 28 54 

Qy 3369 TAAGCAAGAGGACAGT GGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGT GAAT GA 3423 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 

Db 2855 T AAG CAAGAGGACAGT GGT GGT AGCAG CAGCAAC AG CAG C AGC AGC AGT GAAT GA 2909 



RESULT 6 
AAX19959 

ID AAX19959 standard; DNA; 1448 BP. 
XX 

AC AAX19959; 
XX 

DT 15-JUN-1999 (first entry) 
XX 

DE Human Tango-83 5'-portion. 
XX 

KW Human; Tango-71; Tango-73; Tango-74; Tango-76; Tango-83; diagnosis; 

KW detection; ds. 

XX 



OS Homo sapiens. 
XX 

PN WO9907850-A1. 
XX 

PD 18-FEB-1999. 
XX 

PF 06-AUG-1998; 98WO-US016502. 
XX 

PR 06-AUG-1997; 97US-0054966P. 

PR 05-SEP-1997; 97US-0058108P. 
XX 

PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC. 
XX 

PI Holtzman DA, Goodearl ADJ; 
XX 

DR WPI; 1999-167426/14. 
XX 

PT New TANGO polypeptides and nucleic acids encoding them - useful as 

PT diagnostic agents and for treating disorders caused by aberrant 

PT expression of TANGO. 
XX 

PS Claim 1; Fig 7; 84pp; English. 
XX 

CC The present sequence is a 5'-portion of Tango-83. Tango polypeptides are 

CC useful for identifying compounds which bind the polypeptide via direct 

CC binding, competition binding assays or Tango-71, -73, -74, 76 or -83- 

CC mediated signal transduction. Tango polypeptides are also useful for 

CC identifying modulating compounds by determining effect on Tango activity. 

CC Tango polypeptides and nucleic acids are useful for diagnosing diseases 

CC related to aberrant expression of Tango, and Tango polypeptides are 

CC useful for raising antibodies which can be used in diagnostic assays for 

CC detection of Tango, and also for generating anti-idiotype antibodies for 

CC prevention and protection 

XX 

SQ Sequence 1448 BP; 308 A; 407 C; 400 G; 333 T; 0 U; 0 Other; 



Query Match 41.7%; Score 1425.8; DB 2; Length 1448; 

Best Local Similarity 99.9%; Pred. No. 0; 

Matches 1427; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy 1216 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 1275 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   18 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 77 

Qy 1276 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 1335 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   78 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 137 

Qy 1336 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 1395 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  138 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 197 

Qy 1396 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 1455 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  198 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 257 

Qy 1456 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 1515 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  258 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 317 

Qy 1516 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 1575 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  318 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 377 

Qy 1576 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 1635 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  378 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 437 

Qy 1636 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGT 1695 
        ||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db  438 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCCGGATGGTCAGGT 497 

Qy 1696 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 1755 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  498 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 557 

Qy 1756 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 1815 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  558 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 617 

Qy 1816 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 1875 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  618 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 677 

Qy 1876 CAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGT 1935 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  678 CAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGT 737 

Qy 1936 GACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTT 1995 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  738 GACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTT 797 

Qy 1996 GGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGAC 2055 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  798 GGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGAC 857 

Qy 2056 AGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCT 2115 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  858 AGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCT 917 

Qy 2116 GCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGC 2175 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  918 GCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGC 977 

Qy 2176 GCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGA 2235 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  978 GCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGA 1037 

Qy 2236 TGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCT 2295 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1038 TGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCT 1097 

Qy 2296 GACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGT 2355 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1098 GACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGT 1157 

Qy 2356 GAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTG 2415 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1158 GAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTG 1217 

Qy 2416 AACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGA 2475 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1218 AACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGA 1277 

Qy 2476 GCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACC 2535 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1278 GCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACC 1337 

Qy 2536 AGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTT 2595 
        ||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||
Db 1338 AGTACTGCTCTCCCTGCTGATTCCTACCAAATCGGGGCCATTGCAGGCATCATCATTCTT 1397 

Qy 2596 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACA 2644 
        |||||||||||||||||||||||||||||||||||||||||||||||||
Db 1398 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACA 1446 



RESULT 7 
ABZ36212 

ID ABZ36212 standard; cDNA; 1273 BP. 
XX 

AC ABZ36212; 
XX 

DT 10-FEB-2003 (first entry) 
XX 

DE Human secretory polynucleotide SPTM SEQ ID NO 376. 
XX 

KW Human; SPTM; autoimmune disorder; inflammatory disorder; AIDS; anaemia; 

KW asthma; Crohn's disease; neurological disorder; epilepsy; cancer; 

KW Huntington's disease; Alzheimer's disease; Creutzfeldt-Jakob disease; 

KW multiple sclerosis; Parkinson's disease; cell proliferative disorder; 

KW anti-inflammatory; immunosuppressive; neuroprotective; nootropic; 

KW neuroleptic; anticonvulsant; cytostatic; antiparkinsonian; anxiolytic; 

KW antipsoriatic; antianaemic; anti-HIV; human immunodeficiency virus; 

KW secretory polynucleotide; secretory protein; gene; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200283876-A2. 
XX 

PD 24-OCT-2002. 
XX 

PF 27-MAR-2002; 2002WO-US009921. 
XX 

PR 29-MAR-2001; 2001US-0280067P. 

PR 29-MAR-2001; 2001US-0280068P. 

PR 16-MAY-2001; 2001US-0291280P. 

PR 17-MAY-2001; 2001US-0291829P. 

PR 17-MAY-2001; 2001US-0291849P. 

PR 19-JUN-2001; 2001US-0299428P. 

PR 20-JUN-2001; 2001US-0299776P. 

PR 20-JUN-2001; 2001US-0300001P. 
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Daffo A, Jones AL, Tran AB, Dahl CR, Gietzen D, Chinn J; 

PI Dufour GE, Hillman JL, Yu JY, Tuason O, Yap PE, Amshey SR; 

PI Daughtery SC, Dam TC, Liu TF, Nguyen DA, Kleefeld Y, Gerstin EH; 

PI Peralta CH, David MH, Lewis SA, Chen AJ, Panzer SR, Harris B; 

PI Flores V, Marwaha R, Lo A, Lan RY, Urashka ME; 

XX 

DR WPI; 2003-075543/07. 

DR P-PSDB; ABP75770. 
XX 

PT New human secretory proteins and polynucleotides, useful for diagnosing, 

PT treating or preventing autoimmune/inflammatory disorders (e.g. AIDS), 

PT neurological disorders (e.g. Alzheimer's), or cell proliferations or 

PT cancers. 
XX 

PS Claim 1; SEQ ID NO 376; 458pp + Sequence Listing; English. 
XX 

CC The invention relates to a secretory polynucleotide (designated sptm) 

CC comprising any of 567 polynucleotide sequences (ABZ35837-ABZ36403), a 

CC naturally occurring polynucleotide sequence at least 90 % identical to 

CC the polynucleotide sequence, a polynucleotide complementary to them or an 

CC RNA equivalent of them. The polypeptide or polynucleotide are useful for 

CC treating, preventing or diagnosing a disease or condition associated with 

CC the expression of functional SPTM. These are particularly useful for 

CC diagnosing, treating or preventing autoimmune/inflammatory disorders 

CC (e.g. acquired immunodeficiency syndrome, anaemia, asthma or Crohn's 

CC disease), neurological disorders (e.g. epilepsy, Huntington's disease, 

CC dementia, stroke, Alzheimer's disease, Creutzfeldt-Jakob disease, 

CC multiple sclerosis, cerebral palsy, Parkinson's disease, anxiety, 

CC schizophrenia or amnesia), or cell proliferative disorders (e.g. 

CC psoriasis, polycythemia vera, or cancers including adenocarcinoma, 

CC leukaemia, lymphoma, melanoma, myeloma, sarcoma or cancers of the brain, 

CC breast, cervix or prostate). Note: The sequence data for this patent did 

CC not form part of the printed specification, but was obtained in electronic 

CC format directly from WIPO at ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 1273 BP; 356 A; 311 C; 297 G; 309 T; 0 U; 0 Other; 

Query Match 30.7%; Score 1049.4; DB 7; Length 1273; 

Best Local Similarity 99.9%; Pred. No. 3.9e-294; 

Matches 1050; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 2373 AGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTGAACAACTCCACCTGCGA 2432 
        |||||||||||||||||||||||||||||||||||||||||||||| |||||||||||||
Db    1 AGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTGAACGACTCCACCTGCGA 60 

Qy 2433 CCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGAGCGAGATGTGATCAAGC 2492 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 CCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGAGCGAGATGTGATCAAGC 120 

Qy 2493 TGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACCAGTACTGCTCTCCCTGC 2552 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACCAGTACTGCTCTCCCTGC 180 

Qy 2553 TGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTTGTCCTAGTTGTTCTCTT 2612 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 TGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTTGTCCTAGTTGTTCTCTT 240 

Qy 2613 CCTACTGGCATTGTTCATTATTTATAGACACAAGCAGAAGGGAAAGGAATCAAGCATGCC 2672 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 CCTACTGGCATTGTTCATTATTTATAGACACAAGCAGAAGGGAAAGGAATCAAGCATGCC 300 

Qy 2673 AGCAGTTACCTACACCCCTGCTATGAGGGTCGTCAATGCAGATTATACCATTTCAGGAAC 2732 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 AGCAGTTACCTACACCCCTGCTATGAGGGTCGTCAATGCAGATTATACCATTTCAGGAAC 360 

Qy 2733 CCTTCCTCACAGCAATGGTGGAAACGCTAATAGCCACTACTTCACCAATCCCAGTTACCA 2792 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 CCTTCCTCACAGCAATGGTGGAAACGCTAATAGCCACTACTTCACCAATCCCAGTTACCA 420 

Qy 2793 CACGCTCACCCAGTGTGCCACATCCCCTCACGTCAACAACAGGGACAGGATGACTGTCAC 2852 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CACGCTCACCCAGTGTGCCACATCCCCTCACGTCAACAACAGGGACAGGATGACTGTCAC 480 

Qy 2853 GAAGTCAAAAAACAATCAACTGTTTGTGAATCTTAAAAATGTGAACCCTGGGAAGAGAGG 2912 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 GAAGTCAAAAAACAATCAACTGTTTGTGAATCTTAAAAATGTGAACCCTGGGAAGAGAGG 540 

Qy 2913 CCCTGTGGGGGACTGCACTGGGACATTGCCGGCTGACTGGAAACATGGCGGCTACCTCAA 2972 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 CCCTGTGGGGGACTGCACTGGGACATTGCCGGCTGACTGGAAACATGGCGGCTACCTCAA 600 

Qy 2973 CGAGCTCGGTGCTTTTGGACTTGACAGAAGCTATATGGGAAAATCCTTAAAAGACCTGGG 3032 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 CGAGCTCGGTGCTTTTGGACTTGACAGAAGCTATATGGGAAAATCCTTAAAAGACCTGGG 660 

Qy 3033 AAAGAATTCTGAATATAATTCAAGTAACTGCTCCCTAAGCAGTTCTGAGAACCCATATGC 3092 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 AAAGAATTCTGAATATAATTCAAGTAACTGCTCCCTAAGCAGTTCTGAGAACCCATATGC 720 

Qy 3093 CACTATTAAAGACCCACCTGTACTTATCCCGAAAAGCTCAGAGTGTGGTTATGTGGAGAT 3152 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CACTATTAAAGACCCACCTGTACTTATCCCGAAAAGCTCAGAGTGTGGTTATGTGGAGAT 780 

Qy 3153 GAAATCGCCGGCACGAAGAGATTCCCCATATGCAGAGATCAATAACTCAACTTCAGCCAA 3212 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 GAAATCGCCGGCACGAAGAGATTCCCCATATGCAGAGATCAATAACTCAACTTCAGCCAA 840 

Qy 3213 CAGGAATGTCTATGAAGTTGAACCTACAGTGAGTGTTGTCCAAGGAGTATTCAGCAATAA 3272 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 CAGGAATGTCTATGAAGTTGAACCTACAGTGAGTGTTGTCCAAGGAGTATTCAGCAATAA 900 

Qy 3273 TGGGCGTCTCTCCCAGGATCCATATGACCTCCCAAAGAACAGTCACATCCCTTGTCATTA 3332 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 TGGGCGTCTCTCCCAGGATCCATATGACCTCCCAAAGAACAGTCACATCCCTTGTCATTA 960 

Qy 3333 TGACCTGCTGCCAGTCCGAGACAGTTCATCCTCCCCTAAGCAAGAGGACAGTGGAGGTAG 3392 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 TGACCTGCTGCCAGTCCGAGACAGTTCATCCTCCCCTAAGCAAGAGGACAGTGGAGGTAG 1020 

Qy 3393 CAGCAGCAACAGCAGCAGCAGCAGTGAATGA 3423 
        |||||||||||||||||||||||||||||||
Db 1021 CAGCAGCAACAGCAGCAGCAGCAGTGAATGA 1051 



RESULT 8 
AAS91826 

ID AAS91826 standard; cDNA; 1074 BP. 
XX 

AC AAS91826; 
XX 

DT 13-FEB-2002 (first entry) 
XX 

DE DNA encoding novel human diagnostic protein #27630. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder; ss. 
XX 

OS Homo sapiens. 
XX 

PN WO200175067-A2. 
XX 

PD 11-OCT-2001. 
XX 

PF 30-MAR-2001; 2001WO-US008631. 
XX 

PR 31-MAR-2000; 2000US-00540217. 

PR 23-AUG-2000; 2000US-00649167. 
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR P-PSDB; ABG27639. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity. 
XX 

PS Claim 1; SEQ ID NO 27630; 103pp; English. 
XX 

CC The invention relates to isolated polynucleotide (I) and polypeptide (II) 

CC sequences. (I) is useful as hybridisation probes, polymerase chain 

CC reaction (PCR) primers, oligomers, and for chromosome and gene mapping, 

CC and in recombinant production of (II). The polynucleotides are also used 

CC in diagnostics as expressed sequence tags for identifying expressed 

CC genes. (I) is useful in gene therapy techniques to restore normal 

CC activity of (II) or to treat disease states involving (II). (II) is 

CC useful for generating antibodies against it, detecting or quantitating a 

CC polypeptide in tissue, as molecular weight markers and as a food 

CC supplement. (II) and its binding partners are useful in medical imaging 

CC of sites expressing (II). (I) and (II) are useful for treating disorders 

CC involving aberrant protein expression or biological activity. The 

CC polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 



CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. AAS64197-AAS94564 represent novel human diagnostic 

CC coding sequences of the invention. Note: The sequence data for this 

CC patent did not appear in the printed specification, but was obtained in 

CC electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 1074 BP; 241 A; 286 C; 303 G; 244 T; 0 U; 0 Other; 

Query Match 18.1%; Score 618.4; DB 5; Length 1074; 

Best Local Similarity 90.0%; Pred. No. 9.5e-169; 

Matches 718; Conservative 0; Mismatches 1; Indels 79; Gaps 2; 

Qy 412 GCCTGCGATGGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGG 471 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 135 GCCTGCGATGGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGG 194 

Qy 472 GCTCTGTGCAACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGC 531 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 195 GCTCTGTGCAACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGC 254 

Qy 532 T GCGAGGACCGCT GT GAGCAGGGCAC CT AT GGTAAC GACTGT CAT CAGAGAT GCCAGT GC 591 

I I I I I I I I I I I I I I I I I ! I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 255 T GC GAGGACC GCT GT GAG CAGGG C AC CT AT GGTAAC GACTGT CAT CAGAGAT GC CAGT GC 314 

Qy 592 CAGAATGGAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACC 651 

I I II I 1 I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 315 CAGAATGGAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACC 374 

Qy 652 GGAGCCTTCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGA 711 

I I 1 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I i I I I I I I I I I I I I I I I I I 

Db 375 GGAGCCTTCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGA 4 34 

Qy 712 TGCCCTTGTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCT 771 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 435 TGCCCTTGTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCT 4 94 

Qy 772 GGCTGGAT 779 

I I I I I I I I 

Db 4 95 GGCTGGATGTTGTCTTTCCCTGGCTGGAGGCCCATCTAATTTTCCAAGTCTCTTTGAATG 554 

Qy 780 — GGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAA 837 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 555 CAGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAA 614 

Qy 838 GAAT GC C AGT GCCATAAT GGAGGGACGT GT GAT GCT GCCACAGGC CAAT GT CATT GCAGT 8 97 

I I I I I I I I I I I I I I I I I I I II I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 615 GAAT GC CAGT GC CAT AAT GGAGGGAC GT GT GAT GCT GC CACAGGC CAAT GT CAT T GCAGT 674 

Qy 898 C CAGGAT ACACAGG GGAAC G GT G C CAGGAT GAGT G 932 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 675 CCAGGAT ACACAGG GGAAC GAGCAGC AGTT C CGGAT GTT AGAAAG GT GC CAGGAT GAGT G 734 

Qy 933 T C CT GT T GGGACCT AT GGCGT T CT CT GT GCT GAGAC CT GC CAGT GT GT CAAC GGAGGGAA 992 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 735 TCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAA 7 94 



Qy  993 GTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGA 1052 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  795 GTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGA 854 

Qy 1053 AGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACAAACGGTGTCCCTGCCA 1112 
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  855 AGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACAAACGGTGTCCCTGCCA 914 

Qy 1113 CTTGGAAAACACTCATAG 1130 
        | ||||||||||||||||
Db  915 CCTGGAAAACACTCATAG 932 
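
The summary figures printed for RESULT 6 through RESULT 8 appear to be related by simple arithmetic. The sketch below is an illustration only, not the search program's documented method: it assumes that Query Match is the score divided by the perfect score of 3423 given at the top of the listing, that Best Local Similarity is matches divided by aligned columns, and that each mismatch costs 0.6. The 0.6 value is not stated anywhere in the report; it is inferred because it reproduces the printed scores together with the stated gap penalties (Gapop 10.0, Gapext 1.0). The function and constant names are hypothetical.

```python
# Hedged sketch: reproduces the per-result summary figures from the counts
# printed in the "Matches ...; Mismatches ...; Indels ...; Gaps ..." lines.

PERFECT_SCORE = 3423   # "Perfect score" reported in the listing header
GAP_OPEN = 10.0        # "Gapop 10.0"
GAP_EXT = 1.0          # "Gapext 1.0"
MISMATCH = 0.6         # ASSUMPTION: inferred from the numbers, not stated in the report


def summarize(matches, mismatches, indels, gaps):
    """Return (score, query match %, best local similarity %), one decimal each."""
    # Assumed scoring: +1 per match, -0.6 per mismatch,
    # -10 per gap opened and -1 per gapped position.
    score = matches - MISMATCH * mismatches - (GAP_OPEN * gaps + GAP_EXT * indels)
    query_match = 100.0 * score / PERFECT_SCORE
    similarity = 100.0 * matches / (matches + mismatches + indels)
    return round(score, 1), round(query_match, 1), round(similarity, 1)


# Counts copied from the three result headers above.
results = {
    "AAX19959": (1427, 2, 0, 0),
    "ABZ36212": (1050, 1, 0, 0),
    "AAS91826": (718, 1, 79, 2),
}

for accession, counts in results.items():
    print(accession, summarize(*counts))
```

Under these assumptions the computed values agree with the Score, Query Match and Best Local Similarity printed for each of the three results.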



RESULT 9 
AAF27791 

ID AAF27791 standard; cDNA; 3567 BP. 
XX 

AC AAF27791; 
XX 

DT 05-APR-2001 (first entry) 
XX 

DE Rat TANGO 272 coding sequence SEQ ID NO: 19. 
XX 

KW Membrane associated protein; secreted protein; human; mouse; rat; 

KW INTERCEPT 340; MANGO 003; MANGO 347; TANGO 272; TANGO 295; TANGO 354; 

KW TANGO 378; skeletal disorder; cardiovascular disorder; renal disorder; 

KW haematopoietic disorder; neural disorder; hepatic disorder; 

KW neoplastic disease; ss. 

XX 

OS Rattus sp. 
XX 

PN WO200100673-A1. 
XX 

PD 04-JAN-2001. 
XX 

PF 29-JUN-2000; 2000WO-US018198. 
XX 

PR 30-JUN-1999; 99US-00345464. 
XX 

PA (MILL-) MILLENNIUM PHARM INC. 
XX 

PI Barnes TM, Fraser CC, Wrighton N, Myers P, Busfield SJ, Sharp JD; 
XX 

DR WPI; 2001-050128/06. 

DR P-PSDB; AAB66269. 
XX 

PT Isolated secreted or transmembrane proteins are used for diagnosis and 

PT treatment of neoplastic and hematopoietic disorders e.g. T cell 

PT disorders, cancer and tumors. 
XX 

PS Claim 1; Page 235-238; 294pp; English. 
XX 

CC The present invention provides the protein and coding sequences for a 

CC number of membrane associated and secreted proteins from human, mouse and 

CC rat. The proteins are designated INTERCEPT 340, MANGO 003, MANGO 347, 

CC TANGO 272, TANGO 295, TANGO 254 and TANGO 378. The proteins are all 

CC involved in signal transduction and the sequences can be used in the 



CC treatment of cardiovascular, renal, hepatic, neural, neoplastic, skeletal 

CC and haematopoietic disorders 

XX 

SQ Sequence 3567 BP; 690 A; 1115 C; 1000 G; 762 T; 0 U; 0 Other; 

Query Match 17.8%; Score 608.4; DB 4; Length 3567; 

Best Local Similarity 56.7%; Pred. No. 1.6e-165; 

Matches 1226; Conservative 0; Mismatches 901; Indels 37; Gaps 4; 

Qy 67 GC AT C ACCT CT GAAT CT T GAAGAC C C T AAT GT GT GT AGC C ACT GG GAAAGCT ACT C AGT G 126 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 236 GCTGGAACACTCAACTCCAATGATCCCAATGTCTGTACCTTCTGGGAAAGCTTCACCACG 295 

Qy 127 ACT GT GCAAGAGT CAT AC C CACAT C C CTTT GATCAAATTTACTACACGAGCTGCACT 183 

II I Mill III I Mill I II I II M 

Db 29 6 ACCACTAAGGAGTCCCACCTTCGCCCCTTCAGCCTGCCCCCAGCCGAGTCCTGCGACAGG 355 

Qy 18 4 GACAT T CT AAAC T GGT T T AAAT G C AC GC GG CAC AGAGT C AG CT AT C GGAC AGC CT AT C GA 243 

I I I I I III I I II I II I I I I II I I I I I I I 

Db 356 CCCTGGGAAGACCCCCACACCTGCGCTCAGCCTACGGTTGTCTACCGGACTGTGTACCGT 415 

Qy 244 C AT GGGGAGAAGACT AT GT AT AG GC G CAAGT CT CAGT GT T GT CCT GGAT T TT AT GAAAGC 303 

I I I I I I I I I I I II I I I I I I I I I I I II I I I I I 

Db 416 CAGGTGGTGAAGATGGACTCCCGCCCACGCCTGCAGTGCTGTGGGGGTTACTACGAGAGC 475 

Qy 304 GGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCA 363 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 

Db 47 6 AGTGGAGCCTGTGTCCCACTCTGTGCCCAGGAGTGTGTCCACGGTCGCTGTGTGGCTCCT 535 

Qy 364 AACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGT 423 

II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 

Db 53 6 AATCGGTGCCAGTGTGCACCAGGCTGGCGGGGTGACGACTGTTCCAGTGAGTGTGCTCCT 595 

Qy 424 GATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAAC 483 

I I I I I I I I I I I I I I I I I I I I I I III I III 

Db 596 GGAATGTGGGGACCACAGTGTGACAGGCTCTGCCTCTGTGGCAACAGCAGTTCCTGTGAT 655 

Qy 484 CCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGC 543 

I I I I I I I I I I I I I I I II I I I I I I I I I II 

Db 656 CCCAGGAGTGGGGTGTGTTTTTGCCCCTCTGGCCTGCAGCCCCCCGACTGCCTTCAGCCT 715 

Qy 54 4 T GT GAGC AG GG C AC CT AT GGTAAC GACT GT CAT C AGAGAT G C CAGT GC CAGAAT GGAG C C 603 

II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 

Db 716 TGCCCCGATGGCCACTATGGTCCTGCCTGCCAGTTTGATTGCCATTGC TATGGGGCA 772 

Qy 604 ACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGT 663 

I I I I I I I I I III III I I I I II I I I I I I I I I I I I 

Db 77 3 TCCTGTGACCCCCGGGATGGAGCCTGCTTCTGCCCCCCAGGGAGAACAGGACCCAGGGCA 832 

Qy 664 GAG GAT CTTTGTCCTCCT G GT AAAC AT GGT C CAC AGT GT GAG C AG AGAT GCCCTTGT C AA 72 3 

I I I I I I I III I I I I I I I I I 

Db 833 CTGATGGCTTCTTCTGCCCCAGAAC TTATCCTTGCCAA 870 

Qy 724 AATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGC 783 

I I I I I I I I I I Ml I II III I I I I I I I I I I I I I I I I I 

Db 871 AATGGAGGTGTTCCTCAGGGCTCTCAAGGCTCCTGCAGCTGCCCACCGGGCTGGATGGGT 930 



Qy 784 ACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGC 843

I I I I I I I I I II I I I II I I I I I M I I I I I I I I I M I I I

Db 931 GTCATCTGTTCCCTGCCATGCCCAGAGGGTTTCCACGGACCCAACTGTACTCAGGAATGT 990

Qy 844 CAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGA 903

I INN II I I I I I I I I I I I I I I I M

Db 991 CGTTGCCACAATGGTGGCCTTTGTGACAGGTTTACTGGGCAGTGCCACTGTGCTCCTGGC 1050

Qy 904 TACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCT 963

I I I I I I I I I I I I I I II II II I I I I I I I

Db 1051 TATATCGGGGATCGGTGCCGTGAAGAGTGCCCTGTGGGCCGCTTCGGTCAAGACTGTGCT 1110

Qy 964 GAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGT 1023

I I I I I I I I I I I I I III III I I I I I I I I I I I I M

Db 1111 GAGACCTGTGACTGTGCTCCTGGCGCTCGTTGCTTTCCTGCCAATGGCGCGTGTCTGTGC 1170

Qy 1024 GAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGC 1083

Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1171 GAACATGGCTTCACAGGCGACCGCTGCACTGAGCGACTCTGTCCAGATGGCCGCTATGGT 1230

Qy 1084 ATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATG 1143

II || I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1231 CTGAGCTGCCAAGATCCCTGCACCTGCGACCCAGAACACAGTCTCAGCTGCCACCCAATG 1290

Qy 1144 TCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCT 1203

I I I I I I I I I I I I I I I I I ! I I I I I I I I I I I I I I I I I I I II M

Db 1291 CACGGCGAGTGCTCCTGCCAGCCAGGTTGGGCGGGCCTCCACTGCAACGAGAGCTGCCCT 1350

Qy 1204 CCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGT 1263

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I M

Db 1351 CAGGACACGCACGGAGCCGGTTGCCAGGAGCACTGCCTCTGTCTGCACGGCGGTGTTTGC 1410

Qy 1264 GACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACC 1323

I I I II III I I I I I I I I I I II IN I I I I I I I I I

Db 1411 CTCGCCGACAGCGGCCTCTGCCGGTGTGCACCTGGCTACACGGGACCTCACTGCGCTAAT 1470

Qy 1324 CCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGAT 1383

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I

Db 1471 CTTTGTCCACCTAACACTTATGGGATCAACTGTTCCTCCCACTGCTCCTGTGAAAATGCC 1530

Qy 1384 GCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGAC 1443

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II M

Db 1531 ATTGCCTGCTCTCCTGTCGACGGCACGTGCATCTGCAAGGAAGGTTGGCAGCGTGGTAAC 1590

Qy 1444 TGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGC 1503

I I I I I I MINI I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1591 TGCTCTGTGCCCTGTCCCCCTGGCACCTGGGGCTTCAGTTGCAATGCCAGTTGCCAGTGT 1650

Qy 1504 CTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGC 1563

I I I I I I I I I I I I I I I I II M I I I I I I I I I I I I I I I I I I I

Db 1651 GCCCACGAGGGAGTCTGCAGCCCCCAAACTGGAGCCTGTACTTGCACCCCTGGGTGGCGT 1710

Qy 1564 GGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGC 1623

I I I I I I I I I I I I I I I I I I I I I I I M M I I I I I I

Db 1711 GGGGTTCACTGCCAACTTCCGTGCCCGAAGGGACAGTTTGGTGAAGGTTGTGCCAGTGTC 1770

Qy 1624 TGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCG 1683



Db 1771 T GT GACT GT GAC C ACT C CGAT GG CT GT GAC C CT GTT CAT GGACACTGC CGAT GT CAG G CT 1830 

Qy 1684 GGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGC 174 3 

I I I I I i I I II II II I I I I II I I I I I MINIM 

Db 1831 GGCTGGATGGGCACACGTTGCCACCTGCCTTGCCCAGAGGGCTTTTGGGGAGCCAACTGC 18 9 0 

Qy 1744 TCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAG 1803 

I MM I I I I I II I I I I I M I I I II I I I I I I I I I I 

Db 1891 AGCAATGCCTGTACCTGCAAGAATGGTGGCACTTGTGTACCTGAGAACGGCAACTGTGTG 1950 

Qy 1804 TGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGG 1863 

I I I I II I I I I I I I II I II I I I I I I I I I I I I I I I I I M I II II I M 

Db 1951 TGCGCACCAGGGTTCAGAGGCCCCTCCTGCCAGAGGCCCTGCCCGCCTGGTCGCTATGGC 2010 

Qy 1864 CATCGCT GCAGCCAGACAT GC C CACAGT GCGTT CACAGCAGCGGGCCCT GC CACCACAT C 1923 

I I I I II II I I I I I I I I I II I I 

Db 2011 AAACGCTG TGTGCCCTGCAAGTGCAACAACCATTCTTCCTGCCACCCGTCG 2061 

Qy 1924 ACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCC 1983 

M II I I I I M I II M I M I I I I MM MM I I I I I I 

Db 2062 GATGGGACCTGCTCCTGCCTGGCAGGCTGGACAGGCCCTGACTGCTCTGAATCATGTCCC 2121 

Qy 1984 AGT GG C AGAT T T GG GAAAAACT GT G C AGGAAT T T GT AC CT GCAC CAAC AAC GGAAC CT GT 2 043 

Ml III II II I I II II I I I II II 

Db 2122 CCAGGCCACTGGGGACTCAAATGCTCCCAACCCTGCCAGTGTCATCATGGTGCCACCTGC 2181 

Qy 2044 AACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAA 2103 

II I II II I II I II I M II II I I I II II I II I I M 

Db 2182 CACCCCCAGGATGGGAGCTGTGTCTGCATCCCAGGCTGGACTGGACCCAACTGCTCGGAA 2241 

Qy 2104 C CAT GT C C AC CT GC C CACT G GGGC C CAAACT GC AT C C AC AC GT GC AACT GC C AT AAT G GA 2163 

I I I II I III II I I II II I II I I I II II I I 

Db 2242 GGCTGCCCATCAAGAATGTTTGGTGTCAACTGCTCCCAGCTATGTCAGTGTGATCCTGGA 2301 

Qy 2164 GCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTAC 2223 

I I I I I II I I II II II II I I I II I II II 

Db 2302 GAGAT GT GCCAC CCAGAGACT GGGGCTT GCGT CT GT CCC CCAGGACACAGT GGT GCGCAC 2361 

Qy 2224 TGCA 2227 

MM 

Db 2362 TGCA 2365 



RESULT 10 
ABA00059 

ID ABA00059 standard; cDNA; 3574 BP. 
XX 

AC ABA00059; 
XX 

DT 25-OCT-2002 (first entry) 
XX 

DE CADHP-6 coding sequence, Incyte ID No: 4097936CB1. 
XX 

KW Gene; human; cell adhesion protein; CADHP; AIDS; Alzheimer's disease; 

KW acquired immunodeficiency syndrome; thymic dysplasia; epilepsy; 

KW renal tubular acidosis; congenital glaucoma; cancer; atherosclerosis; 



KW Parkinson's disease; ss. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 243..3227 

FT /*tag= a 

FT /product= "CADHP-6" 



XX 

PN WO200259312-A2. 
XX 

PD 01-AUG-2002. 
XX 

PF 18-DEC-2001; 2001WO-US049206 . 
XX 

PR 18-DEC-2000; 2000US-0256542P . 

PR 22-DEC-2000; 2000US-0259604P . 

PR 05-JAN-2001; 2001US-0260101P . 
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Duggan BM, Xu Y, Lee EA, Lee S, Lu DAM, Warren BA, Yue H; 

PI Gietzen KJ, Honchell CD, Burford N, Baughn MR, Tang TY, Hillman JL; 

PI Gandhi AR, Kallick DA, Bandman O, Graul RC, Walia NK, Lu Y; 

PI Ramkumar J, Yao MG, Lai PG; 

XX 

DR WPI; 2002-590826/63. 

DR P-PSDB; AAG79417. 
XX 

PT New human cell adhesion proteins (CADHP) useful for treating, diagnosing 

PT and preventing diseases or conditions associated with the aberrant CADPH 

PT expression e.g. cancer, acquired immunodeficiency syndrome, Alzheimer's 

PT disease and epilepsy. 
XX 

PS Claim 5; Page 141-42; 149pp; English. 
XX 

CC The sequences given in ABA00054-63 encode novel human cell adhesion 

CC proteins (CADHP). The CADHP polypeptides and polynucleotides are useful 

CC in treating, diagnosing and preventing diseases or conditions associated 

CC with the decreased expression or overexpression of CADHP, e.g. immune 

CC system (acquired immunodeficiency syndrome, thymic dysplasia), 

CC neurological (Alzheimer's disease, Parkinson's disease, epilepsy), 

CC developmental (renal tubular acidosis, congenital glaucoma) and cell 

CC proliferative (cancer, atherosclerosis) disorders. They are also useful 

CC in assessing the effects of exogenous compounds on the expression of 

CC nucleic acid and amino acid sequences of CADHP. The CADHP or its 

CC fragments are useful in screening compounds for effectiveness as agonist 

CC or antagonist of the polypeptides, or in altering the expression of the 

CC target polynucleotide and compounds that specifically bind to or modulate 

CC the activity of the polypeptide. The protein encoded by this cDNA 

CC sequence shows homology to rat MEGF6 

XX 

SQ Sequence 3574 BP; 626 A; 1218 C; 1045 G; 685 T; 0 U; 0 Other; 



Query Match 17.5%; Score 600.6; DB 6; Length 3574; 

Best Local Similarity 57.4%; Pred. No. 2.9e-163; 

Matches 1165; Conservative 0; Mismatches 849; Indels 15; Gaps 4; 



Qy 62 GGACAGCAT CAC CTCT GAAT CT T GAAGAC CCT AAT GT GT GT AG CC ACT G GGAAAGCT ACT 121 

II II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 289 GGCTGGCTGGAACTCTCAACCCCAGTGATCCCAATACCTGCAGCTTCTGGGAAAGCTTCA 348 

Qy 122 C AGT GACT GT G C AAG AGT CAT AC C CAC AT C C C T T T GAT C AAAT T TACT AC AC GAG C T G C - 180 

I II I I I I I I I I I I I I II I I I II I I I I I 

Db 349 CTACCACCACCAAGGAGTCCCACTCCCGCCCCTTCAGCCTGCTCCCCTCAGAGCCCTGCG 408 

Qy 181 — ACTGACATT CTAAACT GGTTT AAAT GCACGCGGCACAGAGT CAGCT AT CGGACAGC CT 238 

I I I I I I I I I I I I I I I I I I I I I I I I 

Db 4 09 AGCGGCCCTGGGAGGGCCCCCATACTTGCCCCCAGCCCACGGTTGTATACCGGACCGTGT 468 

Qy 239 AT CGACAT GGGGAGAAGACTAT GTAT AGGCGCAAGT CT CAGT GTT GTCCT GGATTTTAT G 298 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I 

Db 4 69 ACCGTCAGGTGGTGAAGACGGACCACCGCCAGCGCCTGCAGTGCTGCCATGGCTTCTATG 528 

Qy 2 99 AAAGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTG 358 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 

Db 529 AGAGCAGGGGGTTCTGTGTCCCGCTCTGTGCCCAGGAGTGTGTCCATGGCCGTTGTGTGG 588 

Qy 359 CTCCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCG 418 

I M | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I III 

Db 589 CACCCAATCAGTGCCAATGTGTGCCAGGCTGGCGGGGCGACGACTGTTCCAGTGAGTGTG 64 8 

Qy 419 ATGGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGT 478 

I I I I I I I I I I I I I I I I II II 

Db 649 CCCCAGGAATGTGGGGGCCACAGTGTGACAAGCCCTGCAGCTGCGGCAACAACAGCTCGT 708 

Qy 47 9 GCAACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGG 538 

I I I I I I I I I I I II II I I I I I III I I I I I I 

Db 709 GTGATCCCAAGAGTGGGGTATGTTCTTGCCCTTCTGGTCTGCAGCCCCCGAACTGCCTTC 768 

Qy 539 ACCGCT GT GAG CAGGGCACCT AT GGT AAC GACT GT CAT C AGAGAT GC CAGT GC CAGAAT G 598 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I III 
Db 7 69 AGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGTTCCGCTGCCAGTGCC ATG 825 

Qy 599 GAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCT 658 

I I I I I I I I I I I I I I I I I Ml I I I I I I III I I I I I I 

Db 826 GGGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGAGAGAACTGGGCCCA 885 

Qy 659 T CT GT GAGGAT CTTT GT CCT CCT GGT AAACAT GGT CCACAGT GT GAGCAGAGAT GCCCTT 718 

I I I I I I I I II I I I I I III M I I I I I 

Db 886 GCTGTGACGTGTCCTGTTCCCAGGGCACTTCTGGCTTCTTCTGCCCCAGCACCCATCCTT 945 

Qy 719 GTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGA 778 

I I I I I I I I I I I I I I I II I I I Ml I I I I I I I I I I I I I I 

Db 946 GCCAAAATGGAGGTGTCTTCCAAACCCCACAGGGCTCCTGCAGCTGCCCCCCTGGCTGGA 1005 

Qy 779 TGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAG 838 

I I I I I I I III I I I I I I I I I I I I I I III I I I I I I I I I I I 

Db 1006 TGGGCACCATCTGCTCCCTGCCCTGCCCAGAGGGCTTTCACGGACCCAACTGCTCCCAGG 1065 



Qy 839 AATGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTC 898

I I I I I II Ill I I I I I I M I M I II

Db 1066 AATGTCGCTGCCACAACGGCGGCCTCTGTGACCGATTCACTGGGCAGTGCCGCTGCGCTC 1125



Qy 899 CAGGAT ACACAGGGGAACGGT GCCAGGATGAGT GTCCT GTT GGGACCTAT GGCGTTCT CT 958 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1126 CGGGTTACACTGGGGATCGGTGCCGGGAGGAGTGCCCGGTGGGCCGCTTTGGGCAGGACT 1185 

Qy 959 GTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCC 1018 

I I I I I I I I I I I I I I II I I I I II I II I I I I I I I I I I I I 

Db 1186 GTGCTGAGACGTGCGACTGCGCCCCGGACGCCCGTTGCTTCCCGGCCAACGGCGCATGTC 1245 

Qy 1019 TCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCT 1078 

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 1246 TGTGCGAACACGGCTTCACTGGGGACCGCTGCACGGATCGCCTCTGCCCCGACGGCTTCT 1305 

Qy 1079 ACGGCAT CAAAT GT GACAAAC GGT GT C CCT GCCACTT GGAAAACACTCATAGCT GTCACC 1138 

I I I I III II I I II I I I I I I I III III I I I I I I II I I 

Db 1306 ACGGTCTCAGCTGCCAGGCCCCCTGCACCTGCGACCGGGAGCACAGCCTCAGCTGCCACC 1365 

Qy 1139 CCAT GT C T GGAGAGT GT GC CT GCAAG C C GGGCT GGT CAGGACT CT ACT GT AAT GAGACAT 1198 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1366 CGATGAACGGGGAGTGCTCCTGCCTGCCGGGCTGGGCGGGCCTCCACTGCAACGAGAGCT 1425 

Qy 1199 GTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAG 1258 

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 

Db 1426 GCCCGCAGGACACGGATGGGCCAGGGT.GC.GAGGAGCACTGTCTCTGCCTGCAGGGTGGCG 148 5 

Qy 1259 ACT GT GACAGT GT GACT GGAAAGT GCACCT GT GCCCCAGGATT CAAAGGAATT GACT GCT 1318 

Ml I I I II II I I I I I I I I I I I II I I II I 

Db 1486 TCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGCCGGGTTACACGGGCCCTCACTGTG 154 5 

Qy 1319 CTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAA 1378 

III I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I 

Db 1546 CTAGTCTTTGTCCTCCTGACACCTACGGTGTCAACTGTTCTGCACGCTGCTCATGTGAAA 1605 

Qy 1379 ATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGG 1438 

III I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II 

Db 1606 ATGCCATCGCCTGCTCACCCATCGACGGCGAGTGCGTCTGCAAGGAAGGTTGGCAGCGTG 1665 

Qy 1439 TGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCC 1498 

I I I I I I I I MM I I I I I I I II II I Mill I I I II 

Db 1666 GTAACTGCTCTGTGCCCTGCCCACCCGGAACCTGGGGCTTCAGTTGCAATGCCAGCTGCC 1725 

Qy 14 99 AGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGAT 1558 

I I I I I I I I I II I II I I I I M II I I II I I II I I I I I I I 

Db 1726 AGT GT GCC CAT GAGGCAGT C T GCAGC C CC CAAACT GGAGC C T GT AC CT GCACCC CT GGGT 1785 

Qy 1559 GGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTG 1618 

III MM I I I I I I I I I M I I I I II II II II I I I 

Db 178 6 GGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGGGGCAGTTTGGAGAAGGTTGTGCCA 1845 

Qy 1619 AGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCC 1678 

I II I I I I I I I I II I I I I I I II I I I I I I I II I III I I I I 

Db 1846 GT CGCT GT GACT GT GACCACT CT GAT GGCT GTGACCCTGTT CAT GGACGCT GTCAGT GCC 1905 

Qy 1679 TCCCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCA 1738 

I I I I I I I II I II II I II M I I I II I I I II I I II 

Db 1906 AGGCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCTGCCCTGAGGGCTTATGGGGAGTCA 1965 

Qy 1739 ACTGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCT 1798 



Db 1966 ACTGTAGCAACACCTGCACCTGCAAGAATGGGGGCACCTGTCTCCCTGAGAATGGCAACT 2025 

Qy 1799 GCGAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTT 1858 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml I I I I I I 

Db 2 02 6 GCGTGTGTGCACCCGGATTCCGGGGCCCCTCCTGCCAGAGATCCTGTCAGCCTGGCCGCT 2085 

Qy 1859 ATGGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACC 1918 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2 08 6 ATGGCAAAC GCT GT GT GC CCTGCAAGTGCGCTAACCACTCC TTCTGCCACC 2136 

Qy 1919 ACATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGT 1978 

I I II I II I I I I I I I I I I I I I I II I I I I I I I II I I 

Db 2137 CCTCGAACGGGACCTGCTACTGCCTGGCTGGCTGGACAGGCCCCGACTGCTCCCAGCGCT 2196 

Qy 1979 GT C CCAGT GGCAGATT T GG GAAAAACT GT GCAGGAATT T GT AC CT GC ACCAACAAC GGAA 2 038 

III I I I I I I I I I MM! I I I M Ml 

Db 2197 GCCCTCTGGGGACATTTGGTGCTAACTGCTCCCAGCCATGCCAGTGTGGTCCTGGAGAAA 2256 

Qy 2 039 C CT GT AAC C C CAT TGACAGAT CT T GT CAGT GTT AC C C CGGTT GGAT T GG 2087 

II MM I MM III I II I I I Ml 

Db 2257 AGTGCCACCCAGAGACTGGGGCCTGTGTATGTCCCCCAGGGCACAGTGG 2305 



RESULT 11 
AAS86746 

ID AAS86746 standard; cDNA; 1402 BP. 
XX 

AC AAS86746; 
XX 

DT 13-FEB-2002 (first entry) 
XX 

DE DNA encoding novel human diagnostic protein #22550. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder; ss. 
XX 

OS Homo sapiens. 
XX 

PN WO200175067-A2. 
XX 

PD 11-OCT-2001. 
XX 

PF 30-MAR-2001; 2001WO-US008631 . 
XX 

PR 31-MAR-2000; 2000US-00540217 . 

PR 23-AUG-2000; 2000US-00649167 . 
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR P-PSDB; ABG22559. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations 



PT responsible for genetic disorders or other traits and to assess 

PT biodiversity. 

XX 

PS Claim 1; SEQ ID NO 22550; 103pp; English. 
XX 

CC The invention relates to isolated polynucleotide (I) and polypeptide (II) 

CC sequences. (I) is useful as hybridisation probes, polymerase chain 

CC reaction (PCR) primers, oligomers, and for chromosome and gene mapping, 

CC and in recombinant production of (II). The polynucleotides are also used 

CC in diagnostics as expressed sequence tags for identifying expressed 

CC genes. (I) is useful in gene therapy techniques to restore normal 

CC activity of (II) or to treat disease states involving (II). (II) is 

CC useful for generating antibodies against it, detecting or quantitating a 

CC polypeptide in tissue, as molecular weight markers and as a food 

CC supplement. (II) and its binding partners are useful in medical imaging 

CC of sites expressing (II). (I) and (II) are useful for treating disorders 

CC involving aberrant protein expression or biological activity. The 

CC polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. AAS64197-AAS94564 represent novel human diagnostic 

CC coding sequences of the invention. Note: The sequence data for this 

CC patent did not appear in the printed specification, but was obtained in 

CC electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 1402 BP; 356 A; 335 C; 413 G; 298 T; 0 U; 0 Other; 

Query Match 17.0%; Score 581.6; DB 5; Length 1402; 

Best Local Similarity 88.5%; Pred. No. 5.7e-158; 

Matches 710; Conservative 0; Mismatches 9; Indels 83; Gaps 4; 



Qy 412 GCCTGCGATGGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGG 471

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 135 GCCTGCGATGGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGG 194

Qy

Db

Qy 528 GCGCTGCGAGGACCGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCA 587

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 255 GCGCTGCGAGGACCGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCA 314

Qy 588 GTGCCAGAATGGAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATA 647

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 315 GTGCCAGAATGGAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATA 374

Qy 648 CACCGGAGCCTTCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCA 707

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 375 CACCGGAGCCTTCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCA 434



Qy  708 GAGATGCCCTTGTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCC 767

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  435 GAGATGCCCTTGTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCC 494

Qy  768 TTCTGGCTGGAT------------------------------------------------ 779

        ||||||||||||

Db  495 TTCTGGCTGGATGTTGTCTTTCCCTGGCTGGAGGCCCATCTAATTTTCCAAGTCTCTTTG 554

Qy  780 ------GGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTC 833

              ||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  555 AATGCAGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTC 614

Qy  834 CCAAGAATGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTG 893

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  615 CCAAGAATGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTG 674

Qy  894 CAGTCCAGGATACACAGGGGAAC-------------------------GGTGCCAGGATG 928

        |||||||||||||||||||||||                         ||||||||||||

Db  675 CAGTCCAGGATACACAGGGGAACGAGCAGCAGTTCCGGATGTTAGAAAGGTGCCAGGATG 734

Qy  929 AGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTGTGTCAACGGAG 988

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  735 AGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTGTGTCAACGGAG 794

Qy  989 GGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTGCTGGCGAGCGCT 1048

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  795 GGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTGCTGGCGAGCGCT 854

Qy 1049 GCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACAAACGGTGTCCCT 1108

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  855 GCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACAAACGGTGTCCCT 914

Qy 1109 GCCACTTGGAAAACACTCATAG 1130

        ||||| ||||||||||||||||

Db  915 GCCACCTGGAAAACACTCATAG 936

RESULT 12 
AAS72218 

ID AAS72218 standard; cDNA; 936 BP. 
XX 

AC AAS72218; 
XX 

DT 13-FEB-2002 (first entry) 
XX 

DE DNA encoding novel human diagnostic protein #8022. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder; ss. 
XX 

OS Homo sapiens. 
XX 

PN WO200175067-A2. 
XX 

PD 11-OCT-2001. 
XX 

PF 30-MAR-2001; 2001WO-US008631 . 
XX 

PR 31-MAR-2000; 2000US-00540217 . 

PR 23-AUG-2000; 2000US-00649167 . 
XX 



PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR P-PSDB; ABG08031. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity. 
XX 

PS Claim 1; SEQ ID NO 8022; 103pp; English. 
XX 

CC The invention relates to isolated polynucleotide (I) and polypeptide (II) 

CC sequences. (I) is useful as hybridisation probes, polymerase chain 

CC reaction (PCR) primers, oligomers, and for chromosome and gene mapping, 

CC and in recombinant production of (II). The polynucleotides are also used 

CC in diagnostics as expressed sequence tags for identifying expressed 

CC genes. (I) is useful in gene therapy techniques to restore normal 

CC activity of (II) or to treat disease states involving (II). (II) is 

CC useful for generating antibodies against it, detecting or quantitating a 

CC polypeptide in tissue, as molecular weight markers and as a food 

CC supplement. (II) and its binding partners are useful in medical imaging 

CC of sites expressing (II). (I) and (II) are useful for treating disorders 

CC involving aberrant protein expression or biological activity. The 

CC polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. AAS64197-AAS94564 represent novel human diagnostic 

CC coding sequences of the invention. Note: The sequence data for this 

CC patent did not appear in the printed specification, but was obtained in 

CC electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 936 BP; 200 A; 239 C; 258 G; 239 T; 0 U; 0 Other; 

Query Match 16.5%; Score 563.8; DB 5; Length 936; 

Best Local Similarity 97.9%; Pred. No. 6.8e-153; 

Matches 571; Conservative 0; Mismatches 12; Indels 0; Gaps 0; 

Qy 1130 GCTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTA 1189

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db   95 GCTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTA 154

Qy 1190 ATGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAA 1249

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  155 ATGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAA 214

Qy 1250 ATGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAA 1309

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  215 ATGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAA 274

Qy 1310 TTGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTG 1369

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  275 TTGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTG 334

Qy 1370 GCTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCT 1429

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  335 GCTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCT 394

Qy 1430 GGCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACT 1489

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  395 GGCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACT 454

Qy 1490 TAACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTG 1549

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  455 TAACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTG 514

Qy 1550 CACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGA 1609

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  515 CACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGA 574

Qy 1610 ACTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATT 1669

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db  575 ACTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATT 634

Qy 1670 GCCGCTGCCTCCCGGGATGGTCAGGTGTCCACTGTGACAGCGT 1712

        ||||||||||||| ||||||||||   ||  | |  |  | ||

Db  635 GCCGCTGCCTCCCCGGATGGTCAGTGTTCACCGGAAATGGTGT 677





RESULT 13 
ACD05889 

ID ACD05889 standard; cDNA; 936 BP. 
XX 

AC ACD05889; 
XX 

DT 06-AUG-2003 (first entry) 
XX 

DE Novel human contig #63. 
XX 

KW Human; angiogenesis; cytokine; cell proliferation; pluripotent; 

KW cell differentiation; totipotent; stem cell; transplantation; bio-sensor; 

KW neuroepithelial cell; autoimmune disease; neural cell; genetic disorder; 

KW nerve; brain tissue; central nervous system disease; 

KW peripheral nervous system disease; neuropathy; haematopoiesis; bone; 

KW myeloid disorder; lymphoid cell disorder; platelet disorder; tendon; 

KW regeneration; cartilage; tendon; ligament; nerve tissue growth; 

KW tissue repair; wound healing; burn; ulcer; osteoporosis; cancer; 

KW osteoarthritis; bone degenerative disorder; periodontal disease; 

KW gut protection; lung fibrosis; liver fibrosis; reperfusion injury; 

KW immune deficiency; infection; autoimmune disorder; allergic reaction; 

KW thrombolysis; thrombosis; coagulation disorder; hereditary disorder; 

KW biorhythm; circadian cycle; fertility; metabolism; catabolism; anabolism; 

KW nootropic; neuroprotective; antiparkinsonian; anticonvulsant; 

KW haemostatic; vulnerary; antiulcer; osteopathic; antiarthritic; 

KW vasotropic; immunostimulant; antibacterial; fungicide; immunosuppressive; 

KW antirheumatic; antidiabetic; antiasthmatic; cytostatic; virucide; 

KW expressed sequence tag; EST; ss. 

XX 

OS Homo sapiens. 
XX 



PN WO2003023013-A2. 
XX 

PD 20-MAR-2003. 
XX 

PF 13-SEP-2002; 2002WO-US029001 . 
XX 

PR 13-SEP-2001; 2001US-0322511P . 

PR 12-SEP-2002; 2002US-00243552 . 
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Tang YT, Yang Y, Wang Z, Weng G, Ma Y; 
XX 

DR WPI; 2003-313249/30. 

DR P-PSDB; ABO00812. 
XX 

PT Novel nucleic acids and polypeptides for diagnosis, treatment of central 

PT and peripheral nervous system diseases and neuropathies, such as 

PT Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic 

PT lateral sclerosis. 

XX 

PS Example 2; SEQ ID NO 735; 300pp; English. 
XX 

CC The present invention relates to the isolation of novel human 

CC polynucleotide sequences and their encoding polypeptides. The novel 

CC polypeptides exhibit activities relating to angiogenesis, cytokine, cell 

CC proliferation, cell differentiation, antiinflammatory, and stem cell 

CC growth factor activities. The polypeptides are involved in the 

CC proliferation, differentiation and survival of pluripotent and totipotent 

CC stem cells, and are useful for re-engineering damaged or diseased 

CC tissues, transplantation, manufacture of bio-pharmaceuticals and 

CC development of bio-sensors. The polypeptides can be used to manipulate 

CC stem cells in culture to give rise to neuroepithelial cells that can be 

CC used to augment or replace cells damaged by illness, autoimmune disease, 

CC accidental damage or genetic disorders. The polypeptides induce the 

CC proliferation of neural cells and regeneration of nerve and brain tissue 

CC and are useful for the treatment of central and peripheral nervous system 

CC diseases and neuropathies, such as Alzheimer's, Parkinson's disease, 

CC Huntington's disease, amyotrophic lateral sclerosis (ALS). The 

CC polypeptides are also involved in chemotactic or chemokinetic activity, 

CC regulation of haematopoiesis and are useful for treating myeloid or 

CC lymphoid cell disorders, platelet disorders such as thrombocytopaenia and 

CC for regeneration of bone, cartilage, tendon, ligament and/or nerve tissue 

CC growth, in tissue repair, healing of burns, incisions, ulcers, for 

CC treating osteoporosis, osteoarthritis, bone degenerative disorders, and 

CC periodontal disease. The polypeptides are also useful for gut protection 

CC or regeneration and treatment of lung or liver fibrosis, reperfusion 

CC injury in various tissues, various immune deficiencies and disorders 

CC including severe combined immunodeficiency (SCID), bacterial or fungal 

CC infections, autoimmune disorders (e.g. multiple sclerosis, rheumatoid 

CC arthritis, diabetes mellitus, myasthenia gravis), allergic reactions and 

CC conditions, such as asthma or other respiratory problems. The 

CC polypeptides are involved in thrombolysis or thrombosis and are useful in 

CC treatment of various coagulation disorders (including hereditary 

CC disorders such as haemophilia) or to enhance coagulation and other 

CC haemostatic events in treating wounds resulting from trauma, surgery or 

CC other causes. The polypeptides exhibit immune stimulating or immune 



CC suppressing activity, and are useful for treating autoimmune diseases or 

CC cancer. They also inhibit the growth, infection or function of infectious 

CC agents such as bacteria, fungi, viruses, effect biorhythms or circadian 

CC cycles of rhythms, fertility of male or female subjects, metabolism, 

CC catabolism, and anabolism. ACD05827-ACD06027 represent novel contigs 

CC assembled using expressed sequence tag (EST) sequences as seeds. Note: 

CC The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences 



SQ Sequence 936 BP; 200 A; 239 C; 258 G; 239 T; 0 U; 0 Other; 

Query Match 16.5%; Score 563.8; DB 7; Length 936; 

Best Local Similarity 97.9%; Pred. No. 6.8e-153; 

Matches 571; Conservative 0; Mismatches 12; Indels 0; Gaps 0; 



Qy       1130 GCTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTA 1189
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         95 GCTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTA 154

Qy       1190 ATGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAA 1249
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        155 ATGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAA 214

Qy       1250 ATGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAA 1309
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        215 ATGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAA 274

Qy       1310 TTGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTG 1369
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        275 TTGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTG 334

Qy       1370 GCTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCT 1429
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        335 GCTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCT 394

Qy       1430 GGCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACT 1489
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        395 GGCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACT 454

Qy       1490 TAACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTG 1549
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        455 TAACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTG 514

Qy       1550 CACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGA 1609
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        515 CACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGA 574

Qy       1610 ACTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATT 1669
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        575 ACTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATT 634

Qy       1670 GCCGCTGCCTCCCGGGATGGTCAGGTGTCCACTGTGACAGCGT 1712
              ||||||||||||| ||||||||||   ||  | |  |  | ||
Db        635 GCCGCTGCCTCCCCGGATGGTCAGTGTTCACCGGAAATGGTGT 677





RESULT 14 
ACD05589 

ID ACD05589 standard; cDNA; 936 BP. 
XX 

AC ACD05589; 
XX 

DT 06-AUG-2003 (first entry) 
XX 

DE cDNA encoding novel human polypeptide #99. 
XX 

KW Human; angiogenesis; cytokine; cell proliferation; pluripotent; 

KW cell differentiation; totipotent; stem cell; transplantation; bio-sensor; 

KW neuroepithelial cell; autoimmune disease; neural cell; genetic disorder; 

KW nerve; brain tissue; central nervous system disease; 

KW peripheral nervous system disease; neuropathy; haematopoiesis; bone; 

KW myeloid disorder; lymphoid cell disorder; platelet disorder; tendon; 

KW regeneration; cartilage; tendon; ligament; nerve tissue growth; 

KW tissue repair; wound healing; burn; ulcer; osteoporosis; cancer; 

KW osteoarthritis; bone degenerative disorder; periodontal disease; 

KW gut protection; lung fibrosis; liver fibrosis; reperfusion injury; 

KW immune deficiency; infection; autoimmune disorder; allergic reaction; 

KW thrombolysis; thrombosis; coagulation disorder; hereditary disorder; 

KW biorhythm; circadian cycle; fertility; metabolism; catabolism; anabolism; 

KW nootropic; neuroprotective; antiparkinsonian; anticonvulsant; 

KW haemostatic; vulnerary; antiulcer; osteopathic; antiarthritic; 

KW vasotropic; immunostimulant; antibacterial; fungicide; immunosuppressive; 

KW antirheumatic; antidiabetic; antiasthmatic; cytostatic; virucide; gene; 

KW ss. 

XX 

OS Homo sapiens. 
XX 

PN WO2003023013-A2. 
XX 

PD 20-MAR-2003. 
XX 

PF 13-SEP-2002; 2002WO-US029001. 
XX 

PR 13-SEP-2001; 2001US-0322511P. 

PR 12-SEP-2002; 2002US-00243552. 
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Tang YT, Yang Y, Wang Z, Weng G, Ma Y; 
XX 

DR WPI; 2003-313249/30. 

DR P-PSDB; ABO00512. 
XX 

PT Novel nucleic acids and polypeptides for diagnosis, treatment of central 

PT and peripheral nervous system diseases and neuropathies, such as 

PT Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic 

PT lateral sclerosis. 

XX 

PS Claim 1; SEQ ID NO 99; 300pp; English. 
XX 

CC The present invention relates to the isolation of novel human 

CC polynucleotide sequences and their encoding polypeptides. The novel 

CC polypeptides exhibit activities relating to angiogenesis, cytokine, cell 



CC proliferation, cell differentiation, antiinflammatory, and stem cell 

CC growth factor activities. The polypeptides are involved in the 

CC proliferation, differentiation and survival of pluripotent and totipotent 

CC stem cells, and are useful for re-engineering damaged or diseased 

CC tissues, transplantation, manufacture of bio-pharmaceuticals and 

CC development of bio-sensors. The polypeptides can be used to manipulate 

CC stem cells in culture to give rise to neuroepithelial cells that can be 

CC used to augment or replace cells damaged by illness, autoimmune disease, 

CC accidental damage or genetic disorders. The polypeptides induce the 

CC proliferation of neural cells and regeneration of nerve and brain tissue 

CC and are useful for the treatment of central and peripheral nervous system 

CC diseases and neuropathies, such as Alzheimer's, Parkinson's disease, 

CC Huntington's disease, amyotrophic lateral sclerosis (ALS). The 

CC polypeptides are also involved in chemotactic or chemokinetic activity, 

CC regulation of haematopoiesis and are useful for treating myeloid or 

CC lymphoid cell disorders, platelet disorders such as thrombocytopaenia and 

CC for regeneration of bone, cartilage, tendon, ligament and/or nerve tissue 

CC growth, in tissue repair, healing of burns, incisions, ulcers, for 

CC treating osteoporosis, osteoarthritis, bone degenerative disorders, and 

CC periodontal disease. The polypeptides are also useful for gut protection 

CC or regeneration and treatment of lung or liver fibrosis, reperfusion 

CC injury in various tissues, various immune deficiencies and disorders 

CC including severe combined immunodeficiency (SCID), bacterial or fungal 

CC infections, autoimmune disorders (e.g. multiple sclerosis, rheumatoid 

CC arthritis, diabetes mellitus, myasthenia gravis), allergic reactions and 

CC conditions, such as asthma or other respiratory problems. The 

CC polypeptides are involved in thrombolysis or thrombosis and are useful in 

CC treatment of various coagulation disorders (including hereditary 

CC disorders such as haemophilia) or to enhance coagulation and other 

CC haemostatic events in treating wounds resulting from trauma, surgery or 

CC other causes. The polypeptides exhibit immune stimulating or immune 

CC suppressing activity, and are useful for treating autoimmune diseases or 

CC cancer. They also inhibit the growth, infection or function of infectious 

CC agents such as bacteria, fungi, viruses, effect biorhythms or circadian 

CC cycles of rhythms, fertility of male or female subjects, metabolism, 

CC catabolism, and anabolism. ACD05491-ACD05826 represent the novel cDNA 

CC sequences of the invention. Note: The sequence data for this patent did 

CC not form part of the printed specification, but was obtained in 

CC electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 936 BP; 200 A; 239 C; 258 G; 239 T; 0 U; 0 Other; 

Query Match 16.5%; Score 563.8; DB 7; Length 936; 

Best Local Similarity 97.9%; Pred. No. 6.8e-153; 

Matches 571; Conservative 0; Mismatches 12; Indels 0; Gaps 0; 

Qy       1130 GCTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTA 1189
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         95 GCTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTA 154

Qy       1190 ATGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAA 1249
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        155 ATGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAA 214

Qy       1250 ATGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAA 1309
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        215 ATGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAA 274

Qy       1310 TTGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTG 1369
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        275 TTGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTG 334

Qy       1370 GCTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCT 1429
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        335 GCTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCT 394

Qy       1430 GGCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACT 1489
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        395 GGCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACT 454

Qy       1490 TAACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTG 1549
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        455 TAACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTG 514

Qy       1550 CACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGA 1609
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        515 CACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGA 574

Qy       1610 ACTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATT 1669
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        575 ACTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATT 634

Qy       1670 GCCGCTGCCTCCCGGGATGGTCAGGTGTCCACTGTGACAGCGT 1712
              ||||||||||||| ||||||||||   ||  | |  |  | ||
Db        635 GCCGCTGCCTCCCCGGATGGTCAGTGTTCACCGGAAATGGTGT 677



RESULT 15 
AAS72219 

ID AAS72219 standard; cDNA; 2295 BP. 
XX 

AC AAS72219; 
XX 

DT 13-FEB-2002 (first entry) 
XX 

DE DNA encoding novel human diagnostic protein #8023. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder; ss. 
XX 

OS Homo sapiens. 
XX 

PN WO200175067-A2. 
XX 

PD 11-OCT-2001. 
XX 

PF 30-MAR-2001; 2001WO-US008631. 
XX 

PR 31-MAR-2000; 2000US-00540217. 

PR 23-AUG-2000; 2000US-00649167. 
XX 

PA (HYSE-) HYSEQ INC. 



PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR P-PSDB; ABG08032. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity. 
XX 

PS Claim 1; SEQ ID NO 8023; 103pp; English. 
XX 

CC The invention relates to isolated polynucleotide (I) and polypeptide (II) 

CC sequences. (I) is useful as hybridisation probes, polymerase chain 

CC reaction (PCR) primers, oligomers, and for chromosome and gene mapping, 

CC and in recombinant production of (II). The polynucleotides are also used 

CC in diagnostics as expressed sequence tags for identifying expressed 

CC genes. (I) is useful in gene therapy techniques to restore normal 

CC activity of (II) or to treat disease states involving (II). (II) is 

CC useful for generating antibodies against it, detecting or quantitating a 

CC polypeptide in tissue, as molecular weight markers and as a food 

CC supplement. (II) and its binding partners are useful in medical imaging 

CC of sites expressing (II). (I) and (II) are useful for treating disorders 

CC involving aberrant protein expression or biological activity. The 

CC polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. AAS64197-AAS94564 represent novel human diagnostic 

CC coding sequences of the invention. Note: The sequence data for this 

CC patent did not appear in the printed specification, but was obtained in 

CC electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 2295 BP; 610 A; 543 C; 602 G; 540 T; 0 U; 0 Other; 

Query Match 16.2%; Score 554; DB 5; Length 2295; 

Best Local Similarity 97.3%; Pred. No. 8.2e-150; 

Matches 574; Conservative 0; Mismatches 15; Indels 1; Gaps 1; 

Qy       2834 GGGACAGGATGACTGTCACGAAGTCAAAAAACAATCAACTGTTTGTGAATCTTAAAAATG 2893
              ||||  |    ||||   |   ||||||||||||||||||||||||||||||||||||||
Db        575 GGGAGGGACCAACTGCTCCAGTGTCAAAAAACAATCAACTGTTTGTGAATCTTAAAAATG 634

Qy       2894 TGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTGCCGGCTGACTGGA 2953
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        635 TGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTGCCGGCTGACTGGA 694

Qy       2954 AACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGAAGCTATATGGGAA 3013
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        695 AACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGAAGCTATATGGGAA 754

Qy       3014 AATCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAACTGCTCCCTAAGCA 3073
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        755 AATCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAACTGCTCCCTAAGCA 814

Qy       3074 GTTCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATCCCGAAAAGCTCAG 3133
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        815 GTTCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATCCCGAAAAGCTCAG 874

Qy       3134 AGTGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCATATGCAGAGATCA 3193
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        875 AGTGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCATATGCAGAGATCA 934

Qy       3194 ATAACTCAACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACAGTGAGTGTTGTCC 3253
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        935 ATAACTCAACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACAGTGAGTGTTGTCC 994

Qy       3254 AAGGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGACCTCCCAAAGAACA 3313
              |||||||||||||||||||||||||||| ||||||||||||||||||||||| |||||||
Db        995 AAGGAGTATTCAGCAATAATGGGCGTCTTTCCCAGGATCCATATGACCTCCC-AAGAACA 1053

Qy       3314 GTCACATCCCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCATCCTCCCCTAAGC 3373
              |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Db       1054 GTCACATCCCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCATCCTCCCTTAAGC 1113

Qy       3374 AAGAGGACAGTGGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAATGA 3423
              ||||||||||||| ||||||||||||||||||||||||||||||||||||
Db       1114 AAGAGGACAGTGGTGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAATGA 1163



Search completed: March 30, 2004, 02:37:09 
Job time : 1291 secs 
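
The percentage fields printed with each result can be cross-checked against the raw counts. The sketch below assumes (the report does not state this) that Query Match is the Score divided by the query's perfect score of 3423, and that Best Local Similarity is Matches divided by the number of alignment columns (Matches + Mismatches + Indels). The function names are illustrative only.

```python
# Cross-check of the summary fields printed with each result.
# ASSUMPTIONS (not stated in the report itself):
#   Query Match           = Score / Perfect score
#   Best Local Similarity = Matches / (Matches + Mismatches + Indels)

PERFECT_SCORE = 3423  # perfect score of query US-10-092-390-1, from the header


def query_match(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the perfect score, to one decimal place."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches, mismatches, indels):
    """Matching columns as a percentage of all alignment columns."""
    columns = matches + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Counts copied from RESULT 14 (ACD05589) and RESULT 15 (AAS72219).
results = {
    "ACD05589": {"score": 563.8, "matches": 571, "mismatches": 12, "indels": 0},
    "AAS72219": {"score": 554.0, "matches": 574, "mismatches": 15, "indels": 1},
}

for name, r in results.items():
    qm = query_match(r["score"])
    bls = best_local_similarity(r["matches"], r["mismatches"], r["indels"])
    print(f"{name}: Query Match {qm}%, Best Local Similarity {bls}%")
```

Under these assumptions the computed values (16.5% and 97.9% for ACD05589, 16.2% and 97.3% for AAS72219) agree with the figures printed in the report.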



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: March 30, 2004, 02:09:56 ; Search time 235 Seconds 

(without alignments) 
8083.391 Million cell updates/sec 

Title: US-10-092-390-1 
Perfect score: 3423 

Sequence: 1 atggttatttctttgaactc gcagcagcagcagtgaatga 3423 

Scoring table: IDENTITY_NUC 
Gapop 10.0 , Gapext 1.0 

Searched: 682709 seqs, 277475446 residues 

Total number of hits satisfying chosen parameters: 1365418 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
Listing first 45 summaries 

ALIGNMENTS 



RESULT 1 
US-09-130-491-9 

; Sequence 9, Application US/09130491 

; Patent No. 6416974 

; GENERAL INFORMATION: 

; APPLICANT: Holtzman, Douglas A. 

; APPLICANT: Goodearl, Andrew D.J. 
; TITLE OF INVENTION: TANGO-71, TANGO-73, TANGO-74, TANGO-76, AND TANGO-83 
; FILE REFERENCE: 09404/041001 
; CURRENT APPLICATION NUMBER: US/09/130,491 
; CURRENT FILING DATE: 1998-08-07 
; EARLIER APPLICATION NUMBER: US 60/058,108 
; EARLIER FILING DATE: 1997-09-05 
; EARLIER APPLICATION NUMBER: US 60/054,961 
; EARLIER FILING DATE: 1997-08-06 
; NUMBER OF SEQ ID NOS: 16 
; SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 9 
; LENGTH: 1448 
; TYPE: DNA 
; ORGANISM: Homo sapiens 
US-09-130-491-9 



Query Match 41.7%; Score 1425.8; DB 4; Length 1448; 

Best Local Similarity 99.9%; Pred. No. 0; 

Matches 1427; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy       1216 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 1275
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         18 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 77

Qy       1276 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 1335
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         78 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 137

Qy       1336 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 1395
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        138 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 197

Qy       1396 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 1455
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        198 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 257

Qy       1456 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 1515
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        258 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 317

Qy       1516 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 1575
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        318 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 377

Qy       1576 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 1635
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        378 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 437

Qy       1636 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGT 1695
              ||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db        438 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCCGGATGGTCAGGT 497

Qy       1696 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 1755
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        498 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 557

Qy       1756 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 1815
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        558 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 617

Qy       1816 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 1875
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        618 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 677

Qy       1876 CAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGT 1935
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        678 CAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGT 737

Qy       1936 GACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTT 1995
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        738 GACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTT 797

Qy       1996 GGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGAC 2055
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        798 GGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGAC 857

Qy       2056 AGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCT 2115
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        858 AGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCT 917

Qy       2116 GCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGC 2175
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        918 GCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGC 977

Qy       2176 GCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGA 2235
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        978 GCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGA 1037

Qy       2236 TGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCT 2295
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1038 TGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCT 1097

Qy       2296 GACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGT 2355
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1098 GACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGT 1157

Qy       2356 GAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTG 2415
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1158 GAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTG 1217

Qy       2416 AACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGA 2475
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1218 AACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGA 1277

Qy       2476 GCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACC 2535
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1278 GCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACC 1337

Qy       2536 AGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTT 2595
              ||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||
Db       1338 AGTACTGCTCTCCCTGCTGATTCCTACCAAATCGGGGCCATTGCAGGCATCATCATTCTT 1397

Qy       2596 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACA 2644
              |||||||||||||||||||||||||||||||||||||||||||||||||
Db       1398 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACA 1446
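
The header of this search names a Smith-Waterman ("sw") model with the IDENTITY_NUC scoring table, Gapop 10.0 and Gapext 1.0. As a rough illustration of that kind of scoring, the sketch below computes a local alignment score with affine gaps (Gotoh's recurrences). It is not GenCore's implementation. The match and mismatch values (+1 and -0.6) are assumptions, since the report does not print the IDENTITY_NUC table; they were picked because 1427 - 2 x 0.6 = 1425.8 agrees with the score reported for RESULT 1. Charging Gapop + Gapext for the first gap column and Gapext for each further column is likewise an assumption, and `sw_score` is a hypothetical name.

```python
def sw_score(query, target, match=1.0, mismatch=-0.6,
             gap_open=10.0, gap_ext=1.0):
    """Best local alignment score with affine gaps (Gotoh recurrences).

    ASSUMED scoring, not taken from the report: +match per identical
    column, mismatch per differing column, and a gap of L columns costs
    gap_open + L * gap_ext.
    """
    neg_inf = float("-inf")
    cols = len(target) + 1
    h_prev = [0.0] * cols      # best scores ending in the previous row
    f_prev = [neg_inf] * cols  # previous row, alignment ending in a vertical gap
    best = 0.0
    for q_char in query:
        h_cur = [0.0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf            # current row, alignment ending in a horizontal gap
        for j in range(1, cols):
            # Extend an existing gap or open a new one, in each direction.
            e = max(e - gap_ext, h_cur[j - 1] - gap_open - gap_ext)
            f_cur[j] = max(f_prev[j] - gap_ext,
                           h_prev[j] - gap_open - gap_ext)
            step = match if q_char == target[j - 1] else mismatch
            # Local alignment: a cell never drops below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + step, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


# First 24 columns of the Qy 1670 / Db 635 block: 23 matches, 1 mismatch.
qy = "GCCGCTGCCTCCCGGGATGGTCAG"
db = "GCCGCTGCCTCCCCGGATGGTCAG"
print(round(sw_score(qy, db), 1))
```

With the assumed values, the 24-column excerpt scores 23 - 0.6 = 22.4. The function returns only the score; recovering the alignment itself would need a traceback, which is omitted here to keep the sketch short.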



RESULT 2 

US-09-833-381-1910/c 

; Sequence 1910, Application US/09833381 

; Patent No. 6672186 

; GENERAL INFORMATION: 

; APPLICANT: Robison, Keith E. 

; TITLE OF INVENTION: Novel Nucleic Acid and Protein Homologs 
; FILE REFERENCE: 5800-119 
; CURRENT APPLICATION NUMBER: US/09/833,381 
; CURRENT FILING DATE: 2001-04-11 
; PRIOR APPLICATION NUMBER: 09/516,448 
; PRIOR FILING DATE: 2000-02-29 
; NUMBER OF SEQ ID NOS: 2050 
; SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 1910 
; LENGTH: 5197 
; TYPE: DNA 
; ORGANISM: Homo sapiens 
; FEATURE: 
; NAME/KEY: misc_feature 
; LOCATION: (1)...(5197) 
; OTHER INFORMATION: n = A,T,C or G 
US-09-833-381-1910 



Query Match 15.3%; Score 524.6; DB 4; Length 5197; 

Best Local Similarity 58.0%; Pred. No. 1.3e-152; 

Matches 972; Conservative 0; Mismatches 694; Indels 11; Gaps 



Qy        531 CTGCGAGGACCGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTG 590
              ||||    | | ||||   |  |||  ||||||    | ||| ||     | ||||||||
Db       4243 CTGCCTTCAGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGTTCCGCTGCCAGTG 4184

Qy        591 CCAGAATGGAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACAC 650
              ||   |||| ||  ||||||| | |   || || |  |||  ||||||  |||     ||
Db       4183 CC---ATGGGGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGAGAGAAC 4127

Qy        651 CGGAGCCTTCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAG 710
               ||  ||  ||||||| |     ||| | |  || |    |||       ||       | 
Db       4126 TGGGCCCAGCTGTGACGTGTCCTGTTCCCAGGGCACTTCTGGCTTCTTCTGCCCCAGCAC 4067

Qy        711 ATGCCCTTGTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTC 770
                  ||||| ||||||||||| || |  ||   |      ||    |||   |||||  |
Db       4066 CCATCCTTGCCAAAATGGAGGTGTCTTCCAAACCCCACAGGGCTCCTGCAGCTGCCCCCC 4007

Qy        771 TGGCTGGATGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTG 830
              |||||||||||||||  | ||    | ||| ||||| |||||       |||   |||||
Db       4006 TGGCTGGATGGGCACCATCTGCTCCCTGCCCTGCCCAGAGGGCTTTCACGGACCCAACTG 3947

Qy        831 TTCCCAAGAATGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCA 890
               ||||| ||||| |  ||||| || || ||    |||||      ||| || || || | 
Db       3946 CTCCCAGGAATGTCGCTGCCACAACGGCGGCCTCTGTGACCGATTCACTGGGCAGTGCCG 3887

Qy        891 TTGCAGTCCAGGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGG 950
               |||  ||| || ||||| ||||| ||||||| ||| ||||| || || ||   || |||
Db       3886 CTGCGCTCCGGGTTACACTGGGGATCGGTGCCGGGAGGAGTGCCCGGTGGGCCGCTTTGG 3827

Qy        951 CGTTCTCTGTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGG 1010
                    |||||||||||| ||| | || | |   |  |     || | ||  |  | |||
Db       3826 GCAGGACTGTGCTGAGACGTGCGACTGCGCCCCGGACGCCCGTTGCTTCCCGGCCAACGG 3767

Qy       1011 CGCATGCCTCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGA 1070
              |||||| || || |||   |||||  |||| || ||||||   |  ||||| || || ||
Db       3766 CGCATGTCTGTGCGAACACGGCTTCACTGGGGACCGCTGCACGGATCGCCTCTGCCCCGA 3707

Qy       1071 GGGGCTCTACGGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAG 1130
               ||  |||||||  |||  ||  |    |  ||  ||||| ||  |||  |||  |  ||
Db       3706 CGGCTTCTACGGTCTCAGCTGCCAGGCCCCCTGCACCTGCGACCGGGAGCACAGCCTCAG 3647

Qy 1131 CTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAA 1190 

I I I I I I I I I I I I I II I I I I I I I I I 

Db 3646 CTGCCACCCGATGAACGGGGAGTGCTCCTGCCTGCCGGGTTGGGCGGGCCTCCACTGCAA 3587 

Qy 1191 TGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAA 1250 

I I I I II II I I I I I II I I I I I I I I Ml I 

Db 3586 CGAGAGCTGCCCGCAGGACACGCATGGGCCAGGGTGCCAGGAGCACTGTCTCTGCCTGCA 3527 

Qy 1251 TGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAAT 1310 

II I I III I I III II I I I I I I I I I I I I I 

Db 3526 CGGTGGCGTCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGCCGGGTTACACGGGCCC 3467 

Qy 1311 TGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGG 1370 

I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 3466 TCACTGTGCTAGTCTTTGTCCTCCTGACACCTACGGTGTCAACTGTTCTGCACGCTGCTC 3407 

Qy 1371 CTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTG 1430 

I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I 

Db 3406 ATGTGAAAATGCCATCGCCTGCTCACCCATCGACGGCGAGTGCGTCTGCAAGGAAGGTTG 3347 

Qy 1431 GCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTT 1490 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 3346 GCAGCGTGGTAACTGCTCTGTGCCCTGCCCACCCGGAACCTGGGGCTTCAGTTGCAATGC 3287 

Qy 1491 AACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGC 1550 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I M I I I 

Db 3286 CAGCTGCCAGTGTGCCCATGAGGCAGTCTGCAGCCCCCAAACTGGAGCCTGTACCTGCAC 3227 

Qy 1551 ACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAA 1610 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M M 

Db 3226 CCCTGGGTGGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGGGGCAGTTTGGAGAAGG 3167 

Qy 1611 CTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTG 1670 

Mill I I I I I I I I I I I II I I I I I I I I I I I I I I I III M 

Db 3166 TTGTGCCAGTCGCTGTGACTGTGACCACTCTGATGGCTGTGACCCTGTTCATGGACGCTG 3107 

Qy 1671 CCGCTGCCTCCCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTG 1730 

I I I M I I I I I I I I I I I I I I I II II I I I I I I I M 

Db 3106 TCAGTGCCAGGCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCTGCCCTGAGGGCTTATG 3047 

Qy 1731 GGGCCCCAACTGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGA 1790 

II I I II I I I I II I I I I I I I I I I I I I I I III I I I I I I I 

Db 3046 GGGAGTCAACTGTAGCAACACCTGCACCTGCAAGAATGGGGGCACCTGTCTCCCTGAGAA 2987 

Qy 1791 TGGCATCTGCGAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCC 1850 

I I I I I I I I I I I I I II I I I I II I I I I I III M I I I I I I I I I I I M 

Db 2986 TGGCAACTGCGTGTGTGCACCCGGATTCCGGGGCCCCTCCTGCCAGAGATCCTGTCAGCC 2927 

Qy 1851 TGGTTTTTATGGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCC 1910 

III I I I I I I I I I I I I I I I I I I I I I I I 

Db 2926 TGGCCGCTATGGCAAACGCTGTG--------TGCCCTGCAAGTGCGCTAACCACTCCTTC 2875 

Qy 1911 CTGCCACCACATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAA 1970 

I || I I I I II II I I I I I I I I I I I I I I I I I I I I I I I 



Db 



2874 TGCCACCCCTTCGAACGGGACCTGCTACTGCCTGGCTGGCTGGACAGGCCCCGACTGCTC 2815 



Qy 1971 TGAAGTGTGTCCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAA 2030 

I I I I I II I II I II I I I I I II III II II 

Db 2814 CCAGCCATGCCCTCCAGGACACTGGGGAGAAAACTGTGCCCAGACCTGCCAATGTCACCA 2755 

Qy 2031 CAACGGAACCTGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAG 2090 

II I I I I I I I I I II I III II II I I I I I I I I I 

Db 2754 TGGTGGGACCTGCCATCCCCAGGATGGGAGCTGTATCTGCCCCCTAGGCTGGACTGGACA 2695 

Qy 2091 TGACTGCTCTCAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAA 2150 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2694 CCACTGCTTAGAAGGCTGCCCTCTGGGGACATTTGGTGCTAACTGCTCCCAGCCATGCCA 2635 

Qy 2151 CTGCCATAAT GGAGCT TT CT GCAGCGCCTACGAT GGGGAAT GTAAAT GCACT CCT GG 2207 

II I II II I I I I II I Mill III Ml I II I I 

Db 2634 GTGTGGTCCTGGAGAAAAGTGCCACCCAGAGACTGGGGCCTGTGTATGTCCCCCAGG 257 8 



RESULT 3 

US-09-130-491-10 

; Sequence 10, Application US/09130491
; Patent No. 6416974
; GENERAL INFORMATION:
; APPLICANT: Holtzman, Douglas A.
; APPLICANT: Goodearl, Andrew D.J.
; TITLE OF INVENTION: TANGO-71, TANGO-73, TANGO-74, TANGO-76, AND TANGO-83
; FILE REFERENCE: 09404/041001
; CURRENT APPLICATION NUMBER: US/09/130,491
; CURRENT FILING DATE: 1998-08-07
; EARLIER APPLICATION NUMBER: US 60/058,108
; EARLIER FILING DATE: 1997-09-05
; EARLIER APPLICATION NUMBER: US 60/054,961
; EARLIER FILING DATE: 1997-08-06
; NUMBER OF SEQ ID NOS: 16
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 10
; LENGTH: 1578
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (1)...(1578)
; OTHER INFORMATION: n = A,T,C or G
US-09-130-491-10 

Query Match 4.8%; Score 165.8; DB 4; Length 1578; 

Best Local Similarity 95.5%; Pred. No. 6e-41; 

Matches 191; Conservative 0; Mismatches 6; Indels 3; Gaps 2; 

Qy     3227 AAGTTGAACCTACAGTGAGTGTTGT--CCAAGGAGTATTCAGCAATAATGGGCGTCTCTC 3284
            ||||  | |  ||||||||||||||  |||||||||||||||||||||||||||||| ||
Db       25 AAGTGAACCTAACAGTGAGTGTTGTTCCCAAGGAGTATTCAGCAATAATGGGCGTCTNTC 84

Qy     3285 CC-AGGATCCATATGACCTCCCAAAGAACAGTCACATCCCTTGTCATTATGACCTGCTGC 3343
            || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       85 CCAAGGATCCATATGACCTCCCAAAGAACAGTCACATCCCTTGTCATTATGACCTGCTGC 144

Qy     3344 CAGTCCGAGACAGTTCATCCTCCCCTAAGCAAGAGGACAGTGGAGGTAGCAGCAGCAACA 3403
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      145 CAGTCCGAGACAGTTCATCCTCCCCTAAGCAAGAGGACAGTGGAGGTAGCAGCAGCAACA 204

Qy     3404 GCAGCAGCAGCAGTGAATGA 3423
            ||||||||||||||||||||
Db      205 GCAGCAGCAGCAGTGAATGA 224



RESULT 4 

US-09-188-930-255 

; Sequence 255, Application US/09188930A
; Patent No. 6150502
; GENERAL INFORMATION:
; APPLICANT: Watson, James D.
; APPLICANT: Strachan, Lorna
; APPLICANT: Sleeman, Matthew
; APPLICANT: Onrust, Rene
; APPLICANT: Murison, James Greg
; TITLE OF INVENTION: Compositions Isolated From Skin Cells
; TITLE OF INVENTION: and Methods For Their Use
; FILE REFERENCE: 11000.1011c1
; CURRENT APPLICATION NUMBER: US/09/188,930A
; CURRENT FILING DATE: 1998-11-09
; NUMBER OF SEQ ID NOS: 348
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 255
; LENGTH: 1464
; TYPE: DNA
; ORGANISM: Mouse
US-09-188-930-255 

Query Match 4.6%; Score 156; DB 3; Length 1464; 

Best Local Similarity 51.2%; Pred. No. 6.5e-38; 

Matches 419; Conservative 0; Mismatches 390; Indels 9; Gaps 2; 

Qy      491 CCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGC 550
            | || || ||| ||||  |||| || ||||  ||      |||         ||||   |
Db        9 CGGGAGCCTGCTACTGCCCTGCTGGGTTCCTTGGGGCCGACTGTAGCCTTGCCTGTCCAC 68

Qy      551 AGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCG 610
            ||||   ||  ||   |  ||||   ||    ||    |||  | | || ||  | || |
Db       69 AGGGTCGCTTCGGCCCCAGCTGTGCCCACGTGTGTACATGCGGGCAAGGGGCGGCATGTG 128

Qy      611 ACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATC 670
            |||  ||  |||||   |||  ||| || || ||  | || |||| |   ||||||
Db      129 ACCCAGTGTCGGGGACTTGCATCTGTCCTCCCGGGAAGACGGGAGGCCATTGTGAGCGCG 188

Qy      671 TTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAG 730
              ||||| |  |       |||       ||||| || |  ||  | ||    ||||| |
Db      189 GCTGTCCCCAGGACCGGTTTGGCAAGGGCTGTGAACACAAGTGTGCCTGCAGGAATGGGG 248

Qy      731 GAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGT 790
            |  ||||||||     || |||    ||||| |||||    |||||||||||  ||   |
Db      249 GCCTGTGTCATGCTACCAATGGCAGCTGCTCCTGCCCCCTGGGCTGGATGGGGCCACACT 308

Qy      791 GTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCC 850
            |||  ||  | ||||| |  || |||| |||      |||   ||  || ||    || |
Db      309 GTGAGCACGCCTGCCCTGCTGGGCGCTATGGTGCTGCCTGCCTCCTGGAGTGTTCCTGTC 368

Qy      851 ATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAG 910
            | ||    || |  |||||  |  || | |||   || |  ||  | || || | |   |
Db      369 AGAACAATGGCAGCTGTGAGCCCACCTCCGGCGCTTGCCTCTGTGGCCCTGGCTTCTATG 428

Qy      911 GGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCT 970
            |  ||   ||  | ||    || ||||  ||   | ||||   |   ||     |||  |
Db      429 GTCAAGCTTGTGAAGACACCTGCCCTGCCGGCTTCCATGGATCTGGTTGCCAGAGAGTTT 488

Qy      971 GCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAG 1030
            || |||||    | || |     ||| |||  || || |||   ||||||||    || |
Db      489 GCGAGTGTCAACAGGGCGCTCCCTGTGACCCTGTCAGTGGCCGGTGCCTCTGCCCTGCTG 548

Qy     1031 GCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAAT 1090
            ||||   |||| ||  ||||||     |   |||       ||  | |  ||       |
Db      549 GCTTCCGTGGCCAGTTCTGCGAGAGGGG---GTGCAAGCCAGGCTTTTTTGGAGATGGCT 605

Qy     1091 GTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAG 1150
            |     | | ||||  ||||| |  ||      |      |||| | |||||    ||
Db      606 GCCTGCAGCAGTGTAACTGCCCCACGGGTGTGCC------CTGTGATCCCATCAGCGGCC 659

Qy     1151 AGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGAT 1210
              ||    |||   || ||  |  |||||  |   ||| |       ||       ||
Db      660 TCTGCCTTTGCCCACCAGGGCGCGCAGGAACCACATGTGACCTAGATTGCAGAAGAGGCC 719

Qy     1211 TCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTG 1270
             ||  |||   |  ||    | |  |||    ||      |||||| ||||| |||
Db      720 GCTTTGGGCCGGGCTGTGCCCTGCGCTGTGATTGTGGGGGTGGGGCTGACTGCGACCCCA 779

Qy     1271 TGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGA 1308
            | | |||  |||||  |||||      | | ||  |||
Db      780 TCAGTGGGCAGTGCCACTGTGTGGACAGCTACACGGGA 817



RESULT 5 

US-09-312-283C-255 

; Sequence 255, Application US/09312283C
; Patent No. 6573095
; GENERAL INFORMATION:
; APPLICANT: Watson, James D.
; APPLICANT: Strachan, Lorna
; APPLICANT: Sleeman, Matthew
; APPLICANT: Onrust, Rene
; APPLICANT: Murison, James G.
; APPLICANT: Kumble, Krishanand D.
; TITLE OF INVENTION: Compositions Isolated from Skin Cells
; TITLE OF INVENTION: and Methods for Their Use
; FILE REFERENCE: 11000.1011c2
; CURRENT APPLICATION NUMBER: US/09/312,283C
; CURRENT FILING DATE: 1999-05-14
; NUMBER OF SEQ ID NOS: 425
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 255
; LENGTH: 1464
; TYPE: DNA
; ORGANISM: Mouse
US-09-312-283C-255 

Query Match 4.6%; Score 156; DB 4; Length 1464; 

Best Local Similarity 51.2%; Pred. No. 6.5e-38; 

Matches 419; Conservative 0; Mismatches 390; Indels 9; Gaps 2; 

Qy      491 CCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGC 550
            | || || ||| ||||  |||| || ||||  ||      |||         ||||   |
Db        9 CGGGAGCCTGCTACTGCCCTGCTGGGTTCCTTGGGGCCGACTGTAGCCTTGCCTGTCCAC 68

Qy      551 AGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCG 610
            ||||   ||  ||   |  ||||   ||    ||    |||  | | || ||  | || |
Db       69 AGGGTCGCTTCGGCCCCAGCTGTGCCCACGTGTGTACATGCGGGCAAGGGGCGGCATGTG 128

Qy      611 ACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATC 670
            |||  ||  |||||   |||  ||| || || ||  | || |||| |   ||||||
Db      129 ACCCAGTGTCGGGGACTTGCATCTGTCCTCCCGGGAAGACGGGAGGCCATTGTGAGCGCG 188

Qy      671 TTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAG 730
              ||||| |  |       |||       ||||| || |  ||  | ||    ||||| |
Db      189 GCTGTCCCCAGGACCGGTTTGGCAAGGGCTGTGAACACAAGTGTGCCTGCAGGAATGGGG 248

Qy      731 GAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGT 790
            |  ||||||||     || |||    ||||| |||||    |||||||||||  ||   |
Db      249 GCCTGTGTCATGCTACCAATGGCAGCTGCTCCTGCCCCCTGGGCTGGATGGGGCCACACT 308

Qy      791 GTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCC 850
            |||  ||  | ||||| |  || |||| |||      |||   ||  || ||    || |
Db      309 GTGAGCACGCCTGCCCTGCTGGGCGCTATGGTGCTGCCTGCCTCCTGGAGTGTTCCTGTC 368

Qy      851 ATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAG 910
            | ||    || |  |||||  |  || | |||   || |  ||  | || || | |   |
Db      369 AGAACAATGGCAGCTGTGAGCCCACCTCCGGCGCTTGCCTCTGTGGCCCTGGCTTCTATG 428

Qy      911 GGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCT 970
            |  ||   ||  | ||    || ||||  ||   | ||||   |   ||     |||  |
Db      429 GTCAAGCTTGTGAAGACACCTGCCCTGCCGGCTTCCATGGATCTGGTTGCCAGAGAGTTT 488

Qy      971 GCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAG 1030
            || |||||    | || |     ||| |||  || || |||   ||||||||    || |
Db      489 GCGAGTGTCAACAGGGCGCTCCCTGTGACCCTGTCAGTGGCCGGTGCCTCTGCCCTGCTG 548

Qy     1031 GCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAAT 1090
            ||||   |||| ||  ||||||     |   |||       ||  | |  ||       |
Db      549 GCTTCCGTGGCCAGTTCTGCGAGAGGGG---GTGCAAGCCAGGCTTTTTTGGAGATGGCT 605

Qy     1091 GTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAG 1150
            |     | | ||||  ||||| |  ||      |      |||| | |||||    ||
Db      606 GCCTGCAGCAGTGTAACTGCCCCACGGGTGTGCC------CTGTGATCCCATCAGCGGCC 659

Qy     1151 AGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGAT 1210
              ||    |||   || ||  |  |||||  |   ||| |       ||       ||
Db      660 TCTGCCTTTGCCCACCAGGGCGCGCAGGAACCACATGTGACCTAGATTGCAGAAGAGGCC 719

Qy     1211 TCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTG 1270
             ||  |||   |  ||    | |  |||    ||      |||||| ||||| |||
Db      720 GCTTTGGGCCGGGCTGTGCCCTGCGCTGTGATTGTGGGGGTGGGGCTGACTGCGACCCCA 779

Qy     1271 TGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGA 1308
            | | |||  |||||  |||||      | | ||  |||
Db      780 TCAGTGGGCAGTGCCACTGTGTGGACAGCTACACGGGA 817



RESULT 6 

US-09-312-283C-73 

; Sequence 73, Application US/09312283C
; Patent No. 6573095
; GENERAL INFORMATION:
; APPLICANT: Watson, James D.
; APPLICANT: Strachan, Lorna
; APPLICANT: Sleeman, Matthew
; APPLICANT: Onrust, Rene
; APPLICANT: Murison, James G.
; APPLICANT: Kumble, Krishanand D.
; TITLE OF INVENTION: Compositions Isolated from Skin Cells
; TITLE OF INVENTION: and Methods for Their Use
; FILE REFERENCE: 11000.1011c2
; CURRENT APPLICATION NUMBER: US/09/312,283C
; CURRENT FILING DATE: 1999-05-14
; NUMBER OF SEQ ID NOS: 425
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 73
; LENGTH: 1635
; TYPE: DNA
; ORGANISM: Mouse
US-09-312-283C-73 



Query Match 4.6%; Score 156; DB 4; Length 1635; 

Best Local Similarity 51.2%; Pred. No. 7e-38; 

Matches 419; Conservative 0; Mismatches 390; Indels 9; Gaps 2;



Qy      491 CCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGC 550
            | || || ||| ||||  |||| || ||||  ||      |||         ||||   |
Db        9 CGGGAGCCTGCTACTGCCCTGCTGGGTTCCTTGGGGCCGACTGTAGCCTTGCCTGTCCAC 68

Qy      551 AGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCG 610
            ||||   ||  ||   |  ||||   ||    ||    |||  | | || ||  | || |
Db       69 AGGGTCGCTTCGGCCCCAGCTGTGCCCACGTGTGTACATGCGGGCAAGGGGCGGCATGTG 128

Qy      611 ACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATC 670
            |||  ||  |||||   |||  ||| || || ||  | || |||| |   ||||||
Db      129 ACCCAGTGTCGGGGACTTGCATCTGTCCTCCCGGGAAGACGGGAGGCCATTGTGAGCGCG 188

Qy      671 TTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAG 730
              ||||| |  |       |||       ||||| || |  ||  | ||    ||||| |
Db      189 GCTGTCCCCAGGACCGGTTTGGCAAGGGCTGTGAACACAAGTGTGCCTGCAGGAATGGGG 248

Qy      731 GAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGT 790
            |  ||||||||     || |||    ||||| |||||    |||||||||||  ||   |
Db      249 GCCTGTGTCATGCTACCAATGGCAGCTGCTCCTGCCCCCTGGGCTGGATGGGGCCACACT 308

Qy      791 GTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCC 850
            |||  ||  | ||||| |  || |||| |||      |||   ||  || ||    || |
Db      309 GTGAGCACGCCTGCCCTGCTGGGCGCTATGGTGCTGCCTGCCTCCTGGAGTGTTCCTGTC 368

Qy      851 ATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAG 910
            | ||    || |  |||||  |  || | |||   || |  ||  | || || | |   |
Db      369 AGAACAATGGCAGCTGTGAGCCCACCTCCGGCGCTTGCCTCTGTGGCCCTGGCTTCTATG 428

Qy      911 GGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCT 970
            |  ||   ||  | ||    || ||||  ||   | ||||   |   ||     |||  |
Db      429 GTCAAGCTTGTGAAGACACCTGCCCTGCCGGCTTCCATGGATCTGGTTGCCAGAGAGTTT 488

Qy      971 GCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAG 1030
            || |||||    | || |     ||| |||  || || |||   ||||||||    || |
Db      489 GCGAGTGTCAACAGGGCGCTCCCTGTGACCCTGTCAGTGGCCGGTGCCTCTGCCCTGCTG 548

Qy     1031 GCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAAT 1090
            ||||   |||| ||  ||||||     |   |||       ||  | |  ||       |
Db      549 GCTTCCGTGGCCAGTTCTGCGAGAGGGG---GTGCAAGCCAGGCTTTTTTGGAGATGGCT 605

Qy     1091 GTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAG 1150
            |     | | ||||  ||||| |  ||      |      |||| | |||||    ||
Db      606 GCCTGCAGCAGTGTAACTGCCCCACGGGTGTGCC------CTGTGATCCCATCAGCGGCC 659

Qy     1151 AGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGAT 1210
              ||    |||   || ||  |  |||||  |   ||| |       ||       ||
Db      660 TCTGCCTTTGCCCACCAGGGCGCGCAGGAACCACATGTGACCTAGATTGCAGAAGAGGCC 719

Qy     1211 TCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTG 1270
             ||  |||   |  ||    | |  |||    ||      |||||| ||||| |||
Db      720 GCTTTGGGCCGGGCTGTGCCCTGCGCTGTGATTGTGGGGGTGGGGCTGACTGCGACCCCA 779

Qy     1271 TGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGA 1308
            | | |||  |||||  |||||      | | ||  |||
Db      780 TCAGTGGGCAGTGCCACTGTGTGGACAGCTACACGGGA 817



RESULT 7 

US-09-188-930-73 

; Sequence 73, Application US/09188930A
; Patent No. 6150502
; GENERAL INFORMATION:
; APPLICANT: Watson, James D.
; APPLICANT: Strachan, Lorna
; APPLICANT: Sleeman, Matthew
; APPLICANT: Onrust, Rene
; APPLICANT: Murison, James Greg
; TITLE OF INVENTION: Compositions Isolated From Skin Cells
; TITLE OF INVENTION: and Methods For Their Use
; FILE REFERENCE: 11000.1011c1
; CURRENT APPLICATION NUMBER: US/09/188,930A
; CURRENT FILING DATE: 1998-11-09
; NUMBER OF SEQ ID NOS: 348
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 73
; LENGTH: 1633
; TYPE: DNA
; ORGANISM: mouse
; FEATURE:
; NAME/KEY: unsure
; LOCATION: (1608)...(1608)
US-09-188-930-73 

Query Match 4.5%; Score 155.2; DB 3; Length 1633; 

Best Local Similarity 51.0%; Pred. No. 1.2e-37; 

Matches 417; Conservative 2; Mismatches 390; Indels 9; Gaps 2; 

Qy      491 CCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGC 550
            | || || ||| ||||  |||| || ||||  ||      |||         ||||   |
Db        9 CGGGAGCCTGCTACTGCCCTGCTGGGTTCCTTGGGGCCGACTGTAGCCTTGCCTGTCCAC 68

Qy      551 AGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCG 610
            ||||   ||  ||   |  ||||   ||    ||    |||  | | || ||  | || |
Db       69 AGGGTCGCTTCGGCCCCAGCTGTGCCCACGTGTGTACATGCGGGCAAGGGGCGGCATGTG 128

Qy      611 ACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGTGAGGATC 670
            |||  ||  |||||   |||  ||| || || ||  | || |||| |   ||||||
Db      129 ACCCAGTGTCGGGGACTTGCATCTGTCCTCCCGGGAAGACGGGAGGCCATTGTGAGCGCG 188

Qy      671 TTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAG 730
              ||||| |  |       |||       ||||| || |  ||  | ||    ||||| |
Db      189 GCTGTCCCCAGGACCGGTTTGGCAAGGGCTGTGAACACAAGTGTGCCTGCAGGAATGGGG 248

Qy      731 GAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGT 790
            |  ||||||||     || |||    ||||| |||||    |||||:|||||  ||   |
Db      249 GCCTGTGTCATGCTACCAATGGCAGCTGCTCCTGCCCCCTGGGCTGKATGGGGCCACACT 308

Qy      791 GTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCC 850
            |||  ||  | ||||| |  || |||| |||      |||   ||  || ||    || |
Db      309 GTGAGCACGCCTGCCCTGCTGGGCGCTATGGTGCTGCCTGCCTCCTGGAGTGTTCCTGTC 368

Qy      851 ATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGATACACAG 910
            | ||    || |  |||||  |  || | |||   || |  ||  | || || | |   |
Db      369 AGAACAATGGCAGCTGTGAGCCCACCTCCGGCGCTTGCCTCTGTGGCCCTGGCTTCTATG 428

Qy      911 GGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCT 970
            |  ||   ||  | ||    || ||||  ||   | ||||   |   ||     |||  |
Db      429 GTCAAGCTTGTGAAGACACCTGCCCTGCCGGCTTCCATGGATCTGGTTGCCAGAGAGTTT 488

Qy      971 GCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAG 1030
            || |||||    | || |     ||| |||  || || |||   ||||||||    || |
Db      489 GCGAGTGTCAACAGGGCGCTCCCTGTGACCCTGTCAGTGGCCGGTGCCTCTGCCCTGCTG 548

Qy     1031 GCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAAT 1090
            ||||   |||| ||  ||||||     |   |||       ||  | |  ||       |
Db      549 GCTTCCGTGGCCAGTTCTGCGAGAGGGG---GTGCAAGCCAGGCTTTTTTGGAGATGGCT 605

Qy     1091 GTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAG 1150
            |     | | ||||  ||||| |  ||      |      |||| | |||||    ||
Db      606 GCCTGCAGCAGTGTAACTGCCCCACGGGTGTGCC------CTGTGATCCCATCAGCGGCC 659

Qy     1151 AGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGAT 1210
              ||    |||   || ||  |  |||||  |   ||| |       ||       :|
Db      660 TCTGCCTTTGCCCACCAGGGCGCGCAGGAACCACATGTGACCTAGATTGCAGAAGARGCC 719

Qy     1211 TCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTG 1270
             ||  |||   |  ||    | |  |||    ||      |||||| ||||| |||
Db      720 GCTTTGGGCCGGGCTGTGCCCTGCGCTGTGATTGTGGGGGTGGGGCTGACTGCGACCCCA 779

Qy     1271 TGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGA 1308
            | | |||  |||||  |||||      | | ||  |||
Db      780 TCAGTGGGCAGTGCCACTGTGTGGACAGCTACACGGGA 817



RESULT 8 

US-09-833-381-1076 

; Sequence 1076, Application US/09833381
; Patent No. 6672186
; GENERAL INFORMATION:
; APPLICANT: Robison, Keith E.
; TITLE OF INVENTION: Novel Nucleic Acid and Protein Homologs
; FILE REFERENCE: 5800-119
; CURRENT APPLICATION NUMBER: US/09/833,381
; CURRENT FILING DATE: 2001-04-11
; PRIOR APPLICATION NUMBER: 09/516,448
; PRIOR FILING DATE: 2000-02-29
; NUMBER OF SEQ ID NOS: 2050
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 1076
; LENGTH: 393
; TYPE: DNA
; ORGANISM: Homo sapiens
US-09-833-381-1076 

Query Match 3.9%; Score 134.4; DB 4; Length 393; 

Best Local Similarity 61.9%; Pred. No. 1.4e-31; 

Matches 213; Conservative 0; Mismatches 131; Indels 0; Gaps 0; 

Qy      781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840
            ||  |||||||||  ||||| |||||    ||    |||||  |||||||   ||| ||
Db       28 GGAGCAGTGTGTGCCCAGCCCTGCCCACCAGGGACATTTGGCCAGAACTGCAGCCAGGAT 87

Qy      841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900
            || |  |||||  ||||||||  ||||||    |  || || || || || || |   |
Db       88 TGTCCTTGCCACCATGGAGGGCAGTGTGACCACGTGACTGGACAGTGCCACTGTACAGCT 147

Qy      901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960
            |||||||  |||||  ||||||| || ||||| ||  | ||| |||  ||| | |  ||
Db      148 GGATACATGGGGGACAGGTGCCAAGAGGAGTGCCCCTTCGGGTCCTTCGGCTTCCAGTGC 207

Qy      961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020
             |  ||  |||  | ||   ||| || ||| ||||||  | |   |  || || |||
Db      208 TCACAGCGCTGTGACTGCCACAATGGGGGGCAGTGTTCACCCACCACGGGTGCCTGCGAG 267

Qy     1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080
            |||||  | ||||     |||   |||||| | |  || ||||| || ||||| ||  |
Db      268 TGTGAGCCTGGCTACAAGGGCCCACGCTGCCAGGAGCGACTGTGCCCGGAGGGCCTGCAT 327

Qy     1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACAC 1124
            |||      ||   |   |  || |||||  ||   || |||||
Db      328 GGCCCAGGCTGCACCCTGCCCTGCCCCTGTGACGCTGACAACAC 371



RESULT 9 

US-09-621-976-1192 

; Sequence 1192, Application US/09621976
; Patent No. 6639063
; GENERAL INFORMATION:
; APPLICANT: Dumas Milne Edwards, J.B.
; APPLICANT: Jobert, S.
; APPLICANT: Giordano, J.Y.
; TITLE OF INVENTION: ESTs and Encoded Human Proteins.
; FILE REFERENCE: GENSET.054PR2
; CURRENT APPLICATION NUMBER: US/09/621,976
; CURRENT FILING DATE: 2000-07-21
; NUMBER OF SEQ ID NOS: 19335
; SOFTWARE: Patent.pm
; SEQ ID NO 1192
; LENGTH: 553
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 331..552
US-09-621-976-1192 

Query Match 1.9%; Score 66.2; DB 4; Length 553; 

Best Local Similarity 60.1%; Pred. No. 3.6e-10; 

Matches 110; Conservative 0; Mismatches 73; Indels 0; Gaps 0; 

Qy     2996 ACAGAAGCTATATGGGAAAATCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAA 3055
            ||| |  |  |||||  |||  ||| |||||       ||  | || |  |  | ||| |
Db      347 ACACATACATTATGGACAAAGGCTTCAAAGATTACATGAAAGAATCCGTGTGCAGTTCTA 406

Qy     3056 GTAACTGCTCCCTAAGCAGTTCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTAC 3115
            |||  || ||| | |  ||   ||| ||||| || ||||| ||||| ||||||||  | |
Db      407 GTACTTGTTCCTTGAATAGCAGTGAAAACCCTTACGCCACAATTAAGGACCCACCCATCC 466

Qy     3116 TTATCCCGAAAAGCTCAGAGTGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATT 3175
            | | |   ||     ||||  |  | ||||| || ||||| ||||| |  |  |  |  |
Db      467 TCACCTGCAAGCTTCCAGAAAGCAGCTATGTAGAAATGAAGTCGCCTGTGCACATGGGGT 526

Qy     3176 CCC 3178
            | |
Db      527 CTC 529



RESULT 10 
US-08-323-474-1 

; Sequence 1, Application US/08323474
; Patent No. 5447860
; GENERAL INFORMATION:
; APPLICANT: Ziegler, Steven F.
; TITLE OF INVENTION: NOVEL TYROSINE KINASE
; NUMBER OF SEQUENCES: 8
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Immunex Corporation
; STREET: 51 University Street
; CITY: Seattle
; STATE: Washington
; COUNTRY: US
; ZIP: 98101
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/323,474
; FILING DATE:
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US 07/905,600
; FILING DATE: 26-JUN-1992
; ATTORNEY/AGENT INFORMATION:
; NAME: Seese, Kathryn A.
; REGISTRATION NUMBER: 32,172
; REFERENCE/DOCKET NUMBER: 2609
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (206) 587-0430
; TELEFAX: (206) 233-0644
; TELEX: 756822
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 4138 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA to mRNA
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 149..3523
US-08-323-474-1 

Query Match 1.6%; Score 55.2; DB 1; Length 4138; 

Best Local Similarity 60.8%; Pred. No. 4e-06; 

Matches 90; Conservative 0; Mismatches 58; Indels 0; Gaps 0; 

Qy     1131 CTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAA 1190
            ||||| |||     |    | |||| ||||     | ||||||   || ||  | || ||
Db      985 CTGTCTCCCTGACCCCTATGGGTGTTCCTGTGCCACAGGCTGGAAGGGTCTGCAGTGCAA 1044

Qy     1191 TGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAA 1250
            |||  ||||    ||||| || ||||||  || |||  |||  |  ||||||||| | ||
Db     1045 TGAAGCATGCCACCCTGGTTTTTACGGGCCAGATTGTAAGCTTAGGTGCAGCTGCAACAA 1104

Qy     1251 TGGGGCAGACTGTGACAGTGTGACTGGA 1278
            |||||     |||||  |  |    |||
Db     1105 TGGGGAGATGTGTGATCGCTTCCAAGGA 1132



RESULT 11 
PCT-US93-06093-1 

; Sequence 1, Application PC/TUS9306093
; GENERAL INFORMATION:
; APPLICANT: Ziegler, Steven F.
; TITLE OF INVENTION: NOVEL TYROSINE KINASE
; NUMBER OF SEQUENCES: 3
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Immunex Corporation
; STREET: 51 University Street
; CITY: Seattle
; STATE: Washington
; COUNTRY: US
; ZIP: 98101
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: PCT/US93/06093
; FILING DATE: 19930625
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US 07/905,600
; FILING DATE: 26-JUN-1992
; ATTORNEY/AGENT INFORMATION:
; NAME: Seese, Kathryn A.
; REGISTRATION NUMBER: 32,172
; REFERENCE/DOCKET NUMBER: 2609
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (206) 587-0430
; TELEFAX: (206) 233-0644
; TELEX: 756822
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 4138 base pairs
; TYPE: NUCLEIC ACID
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA to mRNA
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 149..3523
PCT-US93-06093-1 

Query Match 1.6%; Score 55.2; DB 5; Length 4138; 

Best Local Similarity 60.8%; Pred. No. 4e-06; 

Matches 90; Conservative 0; Mismatches 58; Indels 0; Gaps 0; 

Qy     1131 CTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAA 1190
            ||||| |||     |    | |||| ||||     | ||||||   || ||  | || ||
Db      985 CTGTCTCCCTGACCCCTATGGGTGTTCCTGTGCCACAGGCTGGAAGGGTCTGCAGTGCAA 1044

Qy     1191 TGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAA 1250
            |||  ||||    ||||| || ||||||  || |||  |||  |  ||||||||| | ||
Db     1045 TGAAGCATGCCACCCTGGTTTTTACGGGCCAGATTGTAAGCTTAGGTGCAGCTGCAACAA 1104

Qy     1251 TGGGGCAGACTGTGACAGTGTGACTGGA 1278
            |||||     |||||  |  |    |||
Db     1105 TGGGGAGATGTGTGATCGCTTCCAAGGA 1132



RESULT 12 
US-08-220-240A-4 

; Sequence 4, Application US/08220240A
; Patent No. 5955291
; GENERAL INFORMATION:
; APPLICANT: Alitalo, Kari
; APPLICANT: Matikainen, Marja-Terttu
; APPLICANT: Partanen, Juha
; APPLICANT: Makela, Tomi
; APPLICANT: Korhonen, Jaana
; TITLE OF INVENTION: ANTIBODIES RECOGNIZING TIE RECEPTOR
; TITLE OF INVENTION: TYROSINE KINASE AND USES THEREOF
; NUMBER OF SEQUENCES: 5
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun
; STREET: 233 South Wacker Drive/6300 Sears Tower
; CITY: Chicago
; STATE: Illinois
; COUNTRY: United States of America
; ZIP: 60606-6402
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/220,240A
; FILING DATE: 29-MAR-1994
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: PCT/FI93/00006
; FILING DATE: 08-JAN-1993
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US 07/817,800
; FILING DATE: 09-JAN-1992
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: US 08/167,453
; FILING DATE: 15-DEC-1993
; ATTORNEY/AGENT INFORMATION:
; NAME: Gass, David A.
; REGISTRATION NUMBER: 38,153
; REFERENCE/DOCKET NUMBER: 29151/31958
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (312) 474-6300
; TELEFAX: (312) 474-0448
; INFORMATION FOR SEQ ID NO: 4:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 3845 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 37..3450
US-08-220-240A-4 



Query Match 1.6%; Score 54; DB 2; Length 3845; 

Best Local Similarity 51.4%; Pred. No. 9e-06; 

Matches 179; Conservative 0; Mismatches 160; Indels 9; Gaps 2; 

Qy     1709 GCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC---TACTGTAAAA 1765
            | |  ||||  |  || |||||||| ||   |||  ||  |   |||      ||   |
Db      674 GGGGTTGTGGGGCTGGGCGCTGGGGGCCAGGCTGTACCAAGGAGTGCCCAGGTTGCCTAC 733

Qy     1766 ATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGCTTCCGAGGCA 1825
            |||| | |   |||  |    |||| |||   || |  ||  | || ||||||   ||||
Db      734 ATGGAGGTGTCTGCCACGACCATGACGGCGAATGTGTATGCCCCCCTGGCTTCACTGGCA 793

Qy     1826 CCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGCCAGACATGCC 1885
            ||   ||| |   |  ||||      ||   || ||||||  |||||    ||   ||||
Db      794 CCCGCTGTGAACAGGCCTGCAGAGAGGGCCGTTTTGGGCAGAGCTGCCAGGAGCAGTGCC 853

Qy     1886 CACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGTGACTGCTTGC 1945
            ||           | || | || | |  |  |  | || | | ||  | || |||||
Db      854 CAGGCATATCAGGCTGCCGGGGCCTCACCTTCTGCCTCCCAGACCCCTATGGCTGCTCTT 913

Qy     1946 ------CTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTTGGGA 1999
                  ||||||  | |||   ||  ||| | ||||  ||| ||  |||    ||||||
Db      914 GTGGATCTGGCTGGAGAGGAAGCCAGTGCCAAGAAGCTTGTGCCCCTGGTCATTTTGGGG 973

Qy     2000 AAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACC 2047
               | ||   |      ||    ||    ||    || || ||| |||
Db      974 CTGATTGCCGACTCCAGTGCCAGTGTCAGAATGGTGGCACTTGTGACC 1021



RESULT 13 
US-07-934-393B-1 

; Sequence 1, Application US/07934393B
; Patent No. 5466596
; GENERAL INFORMATION:
; APPLICANT: BREITMAN, MARTIN L.
; APPLICANT: DUMONT, DANIEL
; APPLICANT: GRADWOHL, GERARD G.
; TITLE OF INVENTION: TISSUE SPECIFIC TRANSCRIPTIONAL
; TITLE OF INVENTION: REGULATORY ELEMENT
; NUMBER OF SEQUENCES: 5
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: BERESKIN & PARR
; STREET: 40 King Street West
; CITY: Toronto
; STATE: Ontario
; COUNTRY: Canada
; ZIP: M5H 3Y2
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/07/934,393B
; FILING DATE: 25-AUG-1992
; CLASSIFICATION: 435
; ATTORNEY/AGENT INFORMATION:
; NAME: Kurdydyk, Linda M.
; REGISTRATION NUMBER: 34,971
; REFERENCE/DOCKET NUMBER: 3153-64
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (416) 354-7311
; TELEFAX: (416) 361-1398
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 4175 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; HYPOTHETICAL: NO
; ANTI-SENSE: NO
; FRAGMENT TYPE: N-terminal
; ORIGINAL SOURCE:
; ORGANISM: Mus pahari
; STRAIN: CD-1
; DEVELOPMENTAL STAGE: Embryo
; TISSUE TYPE: Heart
; IMMEDIATE SOURCE:
; CLONE: tek
; POSITION IN GENOME:
; CHROMOSOME/SEGMENT: 4
; MAP POSITION: Between the brown and pmv-23 loci
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 124..3477
; OTHER INFORMATION: /function= "putative transmembrane
; OTHER INFORMATION: receptor"
; OTHER INFORMATION: /product= "tyrosine kinase"
; OTHER INFORMATION: /gene= "tek"
; OTHER INFORMATION: /standard_name= "tyrosine kinase receptor protein"
US-07-934-393B-1 

Query Match 1.6%; Score 54; DB 1; Length 4175; 

Best Local Similarity 50.6%; Pred. No. 9.6e-06; 

Matches 166; Conservative 0; Mismatches 150; Indels 12; Gaps 1; 

Qy      716 CTTGTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCT 775
            ||||  | ||    ||||| || ||| | |  || || ||||||  ||||||| |||| |
Db      800 CTTGCAAGAACAATGGAGTCTGCCATGAAGATACCGGGGAATGCATTTGCCCTCCTGGGT 859

Qy      776 GGATGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCC 835
              ||||| | |   ||||   |  ||||      |      ||||| | || ||||
Db      860 TTATGGGGAGAACATGTGAGAAAGCTTGTGAGCCGCACACATTTGGCAGGACCTGTAAAG 919

Qy      836 AAGAATGCCAGTGCCATAATGGAGGGACGTGTGATG------------CTGCCACAGGCC 883
            ||   ||     | |   | ||| | | || | |||            | | | |   |
Db      920 AAAGGTGTAGTGGACCAGAAGGATGCAAGTCTTATGTGTTCTGTCTCCCAGACCCTTACG 979

Qy      884 AATGTCATTGCAGTCCAGGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGA 943
              |||   ||     |||| |  |  |||   | |||| | || |  || ||   |||
Db      980 GGTGTTCCTGTGCCACAGGCTGGAGGGGGTTGCAGTGCAATGAAGCATGCCCATCTGGTT 1039

Qy      944 CCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACG 1003
             ||| ||      ||||      |  ||||| |||  ||| | || ||  ||| | |
Db     1040 ACTACGGACCAGACTGTAAGCTCAGGTGCCACTGTACCAATGAAGAGATATGTGATCGGT 1099

Qy     1004 TGAGCGGCGCATGCCTCTGTGAAGCAGG 1031
            |    ||      |  || | ||| | |
Db     1100 TCCAAGGATGCCTCTGCTCTCAAGGATG 1127



RESULT 14 
US-08-278-089A-1 

; Sequence 1, Application US/08278089A
; Patent No. 5681714
; GENERAL INFORMATION:
; APPLICANT: Breitman, Martin L.
; APPLICANT: Rossant, Janet
; APPLICANT: Dumont, Daniel J.
; APPLICANT: Yamaguchi, Terry P.
; TITLE OF INVENTION: Novel Receptor Tyrosine Kinase
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Bereskin & Parr
; STREET: 40 King Street West
; CITY: Toronto
; STATE: Ontario
; COUNTRY: Canada
; ZIP: M5H 3Y2
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/278,089A
; FILING DATE: 20-JUL-1994
; CLASSIFICATION: 530
; ATTORNEY/AGENT INFORMATION:
; NAME: Kurdydyk, Linda M.
; REGISTRATION NUMBER: 34,971
; REFERENCE/DOCKET NUMBER: 3153-111
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (416) 364-7311
; TELEFAX: (416) 361-1398
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 4175 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; HYPOTHETICAL: NO
; ANTI-SENSE: NO
; FRAGMENT TYPE: N-terminal
; ORIGINAL SOURCE:
; ORGANISM: Mus musculus
; STRAIN: CD-1
; DEVELOPMENTAL STAGE: Embryo
; TISSUE TYPE: Heart
; IMMEDIATE SOURCE:
; CLONE: Tek
; POSITION IN GENOME:
; CHROMOSOME/SEGMENT: 4
; MAP POSITION: Between the brown and pmv-23 loci
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 124..3478
US-08-278-089A-1 

Query Match 1.6%; Score 54; DB 1; Length 4175; 

Best Local Similarity 50.6%; Pred. No. 9.6e-06; 

Matches 166; Conservative 0; Mismatches 150; Indels 12; Gaps 1; 
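The three summary lines above are related by simple arithmetic. The Python sketch below is a rough check only. It assumes (inferred from the numbers in this listing, not stated anywhere in it) that Query Match is Score divided by the perfect score of 3423, that Best Local Similarity is Matches divided by the number of alignment columns, and that the IDENTITY_NUC table with Gapop 10.0 and Gapext 1.0 gives +1 per match, -0.6 per mismatch, and Gapop plus Gapext per gapped position for each gap.

```python
# Sanity check of the summary figures reported for a hit.
# The mismatch penalty (0.6) and the gap-cost formula are assumptions
# inferred from the reported numbers, not documented GenCore behaviour.

PERFECT_SCORE = 3423  # length of the query US-10-092-390-1


def query_match(score, perfect=PERFECT_SCORE):
    """Score as a percentage of the perfect (self-match) score."""
    return 100.0 * score / perfect


def best_local_similarity(matches, mismatches, indels):
    """Matches as a percentage of all alignment columns."""
    return 100.0 * matches / (matches + mismatches + indels)


def estimated_score(matches, mismatches, indels, gaps,
                    mismatch_penalty=0.6, gap_open=10.0, gap_ext=1.0):
    """Score implied by the assumed scoring scheme."""
    return (matches
            - mismatch_penalty * mismatches
            - gap_open * gaps
            - gap_ext * indels)


print(round(query_match(54), 1))                      # reported: 1.6
print(round(best_local_similarity(166, 150, 12), 1))  # reported: 50.6
print(round(estimated_score(166, 150, 12, 1), 1))     # reported: 54
```

Under these assumptions, 166 matches, 150 mismatches and one 12-position gap give 166 - 90 - 22 = 54, which agrees with the reported Score.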

Qy  716 CTTGTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCT 775
        ||||  | ||    ||||| || ||| | |  || || ||||||  ||||||| |||| |
Db  800 CTTGCAAGAACAATGGAGTCTGCCATGAAGATACCGGGGAATGCATTTGCCCTCCTGGGT 859

Qy  776 GGATGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCC 835
          ||||| | |   ||||   |  ||||      |      ||||| | || ||||
Db  860 TTATGGGGAGAACATGTGAGAAAGCTTGTGAGCCGCACACATTTGGCAGGACCTGTAAAG 919

Qy  836 AAGAATGCCAGTGCCATAATGGAGGGACGTGTGATG------------CTGCCACAGGCC 883
        ||   ||     | |   | ||| | | || | |||            | | | |   |
Db  920 AAAGGTGTAGTGGACCAGAAGGATGCAAGTCTTATGTGTTCTGTCTCCCAGACCCTTACG 979

Qy  884 AATGTCATTGCAGTCCAGGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGA 943
          |||   ||     |||| |  |  |||   | |||| | || |  || ||   |||
Db  980 GGTGTTCCTGTGCCACAGGCTGGAGGGGGTTGCAGTGCAATGAAGCATGCCCATCTGGTT 1039

Qy  944 CCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACG 1003
         ||| ||      ||||      |  ||||| |||  ||| | || ||  ||| | |
Db 1040 ACTACGGACCAGACTGTAAGCTCAGGTGCCACTGTACCAATGAAGAGATATGTGATCGGT 1099

Qy 1004 TGAGCGGCGCATGCCTCTGTGAAGCAGG 1031
        |    ||      |  || | ||| | |
Db 1100 TCCAAGGATGCCTCTGCTCTCAAGGATG 1127



RESULT 15 
US-08-838-957A-1 

; Sequence 1, Application US/08838957A
; Patent No. 5998187
; GENERAL INFORMATION:
; APPLICANT: Breitman, Martin L.
; APPLICANT: Rossant, Janet
; APPLICANT: Dumont, Daniel J.
; APPLICANT: Yamaguchi, Terry P.
; TITLE OF INVENTION: Novel Receptor Tyrosine Kinase
; NUMBER OF SEQUENCES: 32
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Bereskin & Parr
; STREET: 40 King Street West
; CITY: Toronto
; STATE: Ontario
; COUNTRY: Canada
; ZIP: M5H 3Y2
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/838,957A
; FILING DATE: 23-APR-1997
; CLASSIFICATION: 435
; ATTORNEY/AGENT INFORMATION:
; NAME: Kurdydyk, Linda M.
; REGISTRATION NUMBER: 34,971
; REFERENCE/DOCKET NUMBER: 3153-212
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (416) 364-7311
; TELEFAX: (416) 361-1398
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 4175 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; HYPOTHETICAL: NO
; ANTI-SENSE: NO
; FRAGMENT TYPE: N-terminal
; ORIGINAL SOURCE:
; ORGANISM: Mus musculus
; STRAIN: CD-1
; DEVELOPMENTAL STAGE: Embryo
; TISSUE TYPE: Heart
; IMMEDIATE SOURCE:
; CLONE: Tek
; POSITION IN GENOME:
; CHROMOSOME/SEGMENT: 4
; MAP POSITION: Between the brown and pmv-23 loci
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 124..3478

US-08-838-957A-1 

Query Match 1.6%; Score 54; DB 2; Length 4175; 

Best Local Similarity 50.6%; Pred. No. 9.6e-06; 

Matches 166; Conservative 0; Mismatches 150; Indels 12; Gaps 1; 

Qy  716 CTTGTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCT 775
        ||||  | ||    ||||| || ||| | |  || || ||||||  ||||||| |||| |
Db  800 CTTGCAAGAACAATGGAGTCTGCCATGAAGATACCGGGGAATGCATTTGCCCTCCTGGGT 859

Qy  776 GGATGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCC 835
          ||||| | |   ||||   |  ||||      |      ||||| | || ||||
Db  860 TTATGGGGAGAACATGTGAGAAAGCTTGTGAGCCGCACACATTTGGCAGGACCTGTAAAG 919

Qy  836 AAGAATGCCAGTGCCATAATGGAGGGACGTGTGATG------------CTGCCACAGGCC 883
        ||   ||     | |   | ||| | | || | |||            | | | |   |
Db  920 AAAGGTGTAGTGGACCAGAAGGATGCAAGTCTTATGTGTTCTGTCTCCCAGACCCTTACG 979

Qy  884 AATGTCATTGCAGTCCAGGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGA 943
          |||   ||     |||| |  |  |||   | |||| | || |  || ||   |||
Db  980 GGTGTTCCTGTGCCACAGGCTGGAGGGGGTTGCAGTGCAATGAAGCATGCCCATCTGGTT 1039

Qy  944 CCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACG 1003
         ||| ||      ||||      |  ||||| |||  ||| | || ||  ||| | |
Db 1040 ACTACGGACCAGACTGTAAGCTCAGGTGCCACTGTACCAATGAAGAGATATGTGATCGGT 1099

Qy 1004 TGAGCGGCGCATGCCTCTGTGAAGCAGG 1031
        |    ||      |  || | ||| | |
Db 1100 TCCAAGGATGCCTCTGCTCTCAAGGATG 1127



Search completed: March 30, 2004, 08:37:02 
Job time : 269 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 
Run on: March 30, 2004, 02:15:41 ; 



Search time 1109 Seconds 

(without alignments) 

11491.309 Million cell updates/sec 



Title: US-10-092-390-1
Perfect score: 3423
Sequence: 1 atggttatttctttgaactc...gcagcagcagcagtgaatga 3423

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 2458946 seqs, 1861504846 residues

Total number of hits satisfying chosen parameters: 4917892

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : Published_Applications_NA:*
1: /cgn2_6/ptodata/2/pubpna/US07_PUBCOMB.seq:*
2: /cgn2_6/ptodata/2/pubpna/PCT_NEW_PUB.seq:*
3: /cgn2_6/ptodata/2/pubpna/US06_NEW_PUB.seq:*
4: /cgn2_6/ptodata/2/pubpna/US06_PUBCOMB.seq:*
5: /cgn2_6/ptodata/2/pubpna/US07_NEW_PUB.seq:*
6: /cgn2_6/ptodata/2/pubpna/PCTUS_PUBCOMB.seq:*
7: /cgn2_6/ptodata/2/pubpna/US08_NEW_PUB.seq:*
8: /cgn2_6/ptodata/2/pubpna/US08_PUBCOMB.seq:*
9: /cgn2_6/ptodata/2/pubpna/US09A_PUBCOMB.seq:*
10: /cgn2_6/ptodata/2/pubpna/US09B_PUBCOMB.seq:*
11: /cgn2_6/ptodata/2/pubpna/US09C_PUBCOMB.seq:*
12: /cgn2_6/ptodata/2/pubpna/US09_NEW_PUB.seq:*
13: /cgn2_6/ptodata/2/pubpna/US10A_PUBCOMB.seq:*
14: /cgn2_6/ptodata/2/pubpna/US10B_PUBCOMB.seq:*
15: /cgn2_6/ptodata/2/pubpna/US10C_PUBCOMB.seq:*
16: /cgn2_6/ptodata/2/pubpna/US10_NEW_PUB.seq:*
17: /cgn2_6/ptodata/2/pubpna/US60_NEW_PUB.seq:*
18: /cgn2_6/ptodata/2/pubpna/US60_PUBCOMB.seq:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
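The throughput figure in the run header can be reproduced from the other header values. The sketch below assumes that both strands of every database sequence were compared, so that the number of dynamic-programming cells is query length x database residues x 2. That factor of two is inferred from the hit count (4917892) being exactly twice the number of sequences searched (2458946); the report does not state it.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Alignment-matrix cells filled per second, in millions.

    strands=2 is an assumption (query compared against both strands).
    """
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


# Values taken from the run header: query length 3423,
# 1861504846 residues searched, search time 1109 seconds.
rate = million_cell_updates_per_sec(3423, 1_861_504_846, 1109)
print(f"{rate:.3f}")  # header reports 11491.309
```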
ALIGNMENTS



RESULT 1 
US-10-092-390-1 

; Sequence 1, Application US/10092390
; Publication No. US20030013865A1
; GENERAL INFORMATION:
; APPLICANT: Yu, Xuanchuan
; APPLICANT: Miranda, Maricar
; TITLE OF INVENTION: Novel Human EGF-Family Proteins and Polynucleotides Encoding the Same
; FILE REFERENCE: LEX-0317-USA
; CURRENT APPLICATION NUMBER: US/10/092,390
; CURRENT FILING DATE: 2002-03-06
; PRIOR APPLICATION NUMBER: US 60/275,013
; PRIOR FILING DATE: 2001-03-12
; NUMBER OF SEQ ID NOS: 4
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 1
; LENGTH: 3423
; TYPE: DNA
; ORGANISM: homo sapiens
US-10-092-390-1 

Query Match 100.0%; Score 3423; DB 14; Length 3423; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 3423; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
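The run header names the "sw model" (Smith-Waterman local alignment) together with the IDENTITY_NUC table, Gapop 10.0 and Gapext 1.0. The following is a minimal Python sketch of affine-gap Smith-Waterman scoring and is not GenCore's implementation. The +1 match and -0.6 mismatch values, and a gap cost of Gapop plus Gapext per gapped position, are assumptions inferred from the scores in this listing. Under those assumptions a sequence aligned to itself scores its own length, which is consistent with this self-hit reporting Score 3423 and Query Match 100.0%.

```python
def sw_affine(q, d, match=1.0, mismatch=-0.6, gap_open=10.0, gap_ext=1.0):
    """Best local alignment score (Smith-Waterman, Gotoh affine gaps).

    A gap of length k costs gap_open + k * gap_ext. All scoring values
    are illustrative assumptions, not documented GenCore parameters.
    """
    neg_inf = float("-inf")
    m = len(d)
    h_prev = [0.0] * (m + 1)   # best scores ending in the previous row
    f = [neg_inf] * (m + 1)    # best score ending in a vertical gap, per column
    best = 0.0
    for i in range(1, len(q) + 1):
        h_cur = [0.0] * (m + 1)
        e = neg_inf            # best score ending in a horizontal gap
        for j in range(1, m + 1):
            e = max(e - gap_ext, h_cur[j - 1] - gap_open - gap_ext)
            f[j] = max(f[j] - gap_ext, h_prev[j] - gap_open - gap_ext)
            s = match if q[i - 1] == d[j - 1] else mismatch
            h = max(0.0, h_prev[j - 1] + s, e, f[j])
            h_cur[j] = h
            best = max(best, h)
        h_prev = h_cur
    return best


print(sw_affine("ACGTACGTAC", "ACGTACGTAC"))  # self-match: score equals length
```

Only the score is computed here; recovering the aligned Qy/Db rows shown in this report would additionally require a traceback.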
Qy    1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60

Qy   61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120

Qy  121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180

Qy  181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240

Qy  241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300

Qy  301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360

Qy  361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420

Qy  421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480

Qy  481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540

Qy  541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600

Qy  601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660

Qy  661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720

Qy  721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780

Qy  781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840

Qy  841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900

Qy  901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960

Qy  961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020

Qy 1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080

Qy 1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140

Qy 1141 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1200

Qy 1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260

Qy 1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320

Qy 1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380

Qy 1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440

Qy 1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500

Qy 1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560

Qy 1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620

Qy 1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680

Qy 1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1740
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1740

Qy 1741 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 1800
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1741 TGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGC 1800

Qy 1801 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 1860
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1801 GAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTAT 1860

Qy 1861 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCAC 1920
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1861 GGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCAC 1920

Qy 1921 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 1980
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1921 ATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGT 1980

Qy 1981 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2040
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1981 CCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACC 2040

Qy 2041 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 TGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCT 2100

Qy 2101 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2101 CAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAAT 2160

Qy 2161 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2161 GGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTC 2220

Qy 2221 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAA 2280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2221 TACTGCACTCAGAGATGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAA 2280



Qy 2281 TGTCAAAACGGAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTC 2340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2281 TGTCAAAACGGAGCTGACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTC 2340

Qy 2341 ATGGGACGGCACTGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAG 2400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2341 ATGGGACGGCACTGTGAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAG 2400

Qy 2401 ATATGTGATTGTCTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGC 2460
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2401 ATATGTGATTGTCTGAACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGC 2460

Qy 2461 CCCGGATGGAAGGGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAAC 2520
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2461 CCCGGATGGAAGGGAGCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAAC 2520

Qy 2521 AGCTTAAGCCGAACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCA 2580
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2521 AGCTTAAGCCGAACCAGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCA 2580

Qy 2581 GGCATCATCATTCTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGA 2640
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2581 GGCATCATCATTCTTGTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGA 2640

Qy 2641 CACAAGCAGAAGGGAAAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGG 2700
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2641 CACAAGCAGAAGGGAAAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGG 2700

Qy 2701 GTCGTCAATGCAGATTATACCATTTCAGGAACCCTTCCTCACAGCAATGGTGGAAACGCT 2760
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2701 GTCGTCAATGCAGATTATACCATTTCAGGAACCCTTCCTCACAGCAATGGTGGAAACGCT 2760

Qy 2761 AATAGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCT 2820
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2761 AATAGCCACTACTTCACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCT 2820

Qy 2821 CACGTCAACAACAGGGACAGGATGACTGTCACGAAGTCAAAAAACAATCAACTGTTTGTG 2880
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2821 CACGTCAACAACAGGGACAGGATGACTGTCACGAAGTCAAAAAACAATCAACTGTTTGTG 2880

Qy 2881 AATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTG 2940
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2881 AATCTTAAAAATGTGAACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCACTGGGACATTG 2940

Qy 2941 CCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGA 3000
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2941 CCGGCTGACTGGAAACATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGA 3000

Qy 3001 AGCTATATGGGAAAATCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAAC 3060
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3001 AGCTATATGGGAAAATCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAAC 3060

Qy 3061 TGCTCCCTAAGCAGTTCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATC 3120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3061 TGCTCCCTAAGCAGTTCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATC 3120

Qy 3121 CCGAAAAGCTCAGAGTGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCA 3180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3121 CCGAAAAGCTCAGAGTGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCA 3180

Qy 3181 TATGCAGAGATCAATAACTCAACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACA 3240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3181 TATGCAGAGATCAATAACTCAACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACA 3240

Qy 3241 GTGAGTGTTGTCCAAGGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGAC 3300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3241 GTGAGTGTTGTCCAAGGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGAC 3300

Qy 3301 CTCCCAAAGAACAGTCACATCCCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCA 3360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3301 CTCCCAAAGAACAGTCACATCCCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCA 3360

Qy 3361 TCCTCCCCTAAGCAAGAGGACAGTGGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAA 3420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 3361 TCCTCCCCTAAGCAAGAGGACAGTGGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAA 3420

Qy 3421 TGA 3423
        |||
Db 3421 TGA 3423

; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 19
; LENGTH: 3552
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (1)...(1803)
US-10-365-227-19

Query Match 64.1%; Score 2192.8; DB 14; Length 3552;
Best Local Similarity 99.9%; Pred. No. 0;
Matches 2205; Conservative 0; Mismatches 2; Indels 1; Gaps 1;



Qy 1216 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 1275
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 60

Qy 1276 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 1335
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 120

Qy 1336 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 1395
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 180

Qy 1396 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 1455
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 240

Qy 1456 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 1515
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 300

Qy 1516 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 1575
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 360

Qy 1576 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 1635
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 420

Qy 1636 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGT 1695
        ||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db  421 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCCGGATGGTCAGGT 480

Qy 1696 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 1755
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 540

Qy 1756 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 1815
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 600

Qy 1816 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 1875
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 660



Qy 1876 CAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGT 1935
Db  661 [sequence illegible in source] 720

Qy 1936 GACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTT 1995
Db  721 [sequence illegible in source] 780

Qy 1996 GGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGAC 2055
Db  781 [sequence illegible in source] 840

Qy 2056 AGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCT 2115
Db  841 [sequence illegible in source] 900

Qy 2116 GCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGC 2175
Db  901 [sequence illegible in source] 960

Qy 2176 GCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGA 2235
Db  961 [sequence illegible in source] 1020

Qy 2236 TGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCT 2295
Db 1021 [sequence illegible in source] 1080

Qy 2296 GACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGT 2355
Db 1081 [sequence illegible in source] 1140

Qy 2356 GAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTG 2415
Db 1141 [sequence illegible in source] 1200

Qy 2416 AACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGA 2475
Db 1201 [sequence illegible in source] 1260

Qy 2476 GCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACC 2535
Db 1261 [sequence illegible in source] 1320

Qy 2536 AGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTT 2595
Db 1321 [sequence illegible in source] 1380

Qy 2596 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACAAGCAGAAGGGA 2655
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACAAGCAGAAGGGA 1440

Qy 2656 AAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGGGTCGTCAATGCAGAT 2715
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 AAGGAATCAAGCATGCCAGCAGTTACCTACACCCCTGCTATGAGGGTCGTCAATGCAGAT 1500



Qy  2716 TATACCATTTCAGGAACCCTTCCTCACAGCAATGGTGGAAACGCTAATAGCCACTACTTC 2775
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1501 TATACCATTTCAGGAACCCTTCCTCACAGCAATGGTGGAAACGCTAATAGCCACTACTTC 1560

Qy  2776 ACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCTCACGTCAACAACAGG 2835
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1561 ACCAATCCCAGTTACCACACGCTCACCCAGTGTGCCACATCCCCTCACGTCAACAACAGG 1620

Qy  2836 GACAGGATGACTGTCACGAAGTCAAAAAACAATCAACTGTTTGTGAATCTTAAAAATGTG 2895
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1621 GACAGGATGACTGTCACGAAGTCAAAAAACAATCAACTGTTTGTGAATCTTAAAAATGTG 1680

Qy  2896
Db  1681 AACCCTGGGAAGAGAGGCCCTGTGGGGGACTGCA-TGGGACATTGCCGGCTGACTGGAAA 1739

Qy  2956 CATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGAAGCTATATGGGAAAA 3015
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1740 CATGGCGGCTACCTCAACGAGCTCGGTGCTTTTGGACTTGACAGAAGCTATATGGGAAAA 1799

Qy  3016 TCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAACTGCTCCCTAAGCAGT 3075
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1800 TCCTTAAAAGACCTGGGAAAGAATTCTGAATATAATTCAAGTAACTGCTCCCTAAGCAGT 1859

Qy  3076 TCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATCCCGAAAAGCTCAGAG 3135
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1860 TCTGAGAACCCATATGCCACTATTAAAGACCCACCTGTACTTATCCCGAAAAGCTCAGAG 1919

Qy  3136 TGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCATATGCAGAGATCAAT 3195
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1920 TGTGGTTATGTGGAGATGAAATCGCCGGCACGAAGAGATTCCCCATATGCAGAGATCAAT 1979

Qy  3196
Db  1980 AACTCAACTTCAGCCAACAGGAATGTCTATGAAGTTGAACCTACAGTGAGTGTTGTCCAA 2039

Qy  3256 GGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGACCTCCCAAAGAACAGT 3315
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  2040 GGAGTATTCAGCAATAATGGGCGTCTCTCCCAGGATCCATATGACCTCCCAAAGAACAGT 2099

Qy  3316 CACATCCCTTGTCATTATGACCTGCTGCCAGTCCGAGACAGTTCATCCTCCCCTAAGCAA 3375
Db  2100

Qy  3376 GAGGACAGTGGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAATGA 3423
         ||||||||||||||||||||||||||||||||||||||||||||||||
Db  2160 GAGGACAGTGGAGGTAGCAGCAGCAACAGCAGCAGCAGCAGTGAATGA 2207



RESULT 3 
US-10-092-390-3 

; Sequence 3, Application US/10092390 

; Publication No. US20030013865A1 

; GENERAL INFORMATION: 

; APPLICANT: Yu, Xuanchuan 

; APPLICANT: Miranda, Maricar 



; TITLE OF INVENTION: Novel Human EGF-Family Proteins and
; TITLE OF INVENTION: Polynucleotides Encoding the Same
; FILE REFERENCE: LEX-0317-USA
; CURRENT APPLICATION NUMBER: US/10/092,390
; CURRENT FILING DATE: 2002-03-06
; PRIOR APPLICATION NUMBER: US 60/275,013
; PRIOR FILING DATE: 2001-03-12
; NUMBER OF SEQ ID NOS: 4
; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 3
; LENGTH: 1761
; TYPE: DNA
; ORGANISM: homo sapiens
US-10-092-390-3 

Query Match 51.4%; Score 1760; DB 14; Length 1761; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1760; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy     1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1 ATGGTTATTTCTTTGAACTCATGCCTGAGCTTTATTTGTTTATTGTTATGCCACTGGATT 60

Qy    61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    61 GGGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTAC 120

Qy   121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   121 TCAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC 180

Qy   181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   181 ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTAT 240

Qy   241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   241 CGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAA 300

Qy   301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   301 AGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCT 360

Qy   361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   361 CCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGAT 420

Qy   421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   421 GGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGC 480

Qy   481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   481 AACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGAC 540



Qy   541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   541 CGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGA 600

Qy   601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   601 GCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTC 660

Qy   661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   661 TGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGT 720

Qy   721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   721 CAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATG 780

Qy   781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   781 GGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAA 840

Qy   841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   841 TGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCA 900

Qy   901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   901 GGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGT 960

Qy   961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   961 GCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTC 1020

Qy  1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1021 TGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTAC 1080

Qy  1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1081 GGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCC 1140

Qy  1141 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1200
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1141 ATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGT 1200

Qy  1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1201 TCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGAC 1260

Qy  1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1261 TGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCT 1320

Qy  1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1321 ACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAAT 1380

Qy  1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1381 GATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTG 1440



Qy  1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1441 GACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAG 1500

Qy  1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1501 TGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGG 1560

Qy  1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1561 CGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAG 1620

Qy  1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1621 CGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTC 1680

Qy  1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1740
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1681 CCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAAC 1740

Qy  1741 TGCTCCCTGCCCTGCTACTG 1760
         ||||||||||||||||||||
Db  1741 TGCTCCCTGCCCTGCTACTG 1760




RESULT 4
US-10-105-929-9
; Sequence 9, Application US/10105929
; Publication No. US20020137142A1



; GENERAL INFORMATION: 

; APPLICANT: Holtzman, Douglas A. 

; APPLICANT: Goodearl, Andrew D.J. 

; TITLE OF INVENTION: TANGO-71, TANGO-73, TANGO-74, TANGO-76, AND TANGO-83 

; FILE REFERENCE: 09404/041001 

; CURRENT APPLICATION NUMBER: US/10/105,929
; CURRENT FILING DATE: 2002-03-25
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: 09/130,491
; PRIOR FILING DATE: EARLIER FILING DATE: 1998-08-07
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: US 60/058,108
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-09-05
; PRIOR APPLICATION NUMBER: EARLIER APPLICATION NUMBER: US 60/054,961
; PRIOR FILING DATE: EARLIER FILING DATE: 1997-08-06
; NUMBER OF SEQ ID NOS: 16
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 9
; LENGTH: 1448
; TYPE: DNA
; ORGANISM: Homo sapiens
US-10-105-929-9

Query Match 41.7%; Score 1425.8; DB 13; Length 1448;
Best Local Similarity 99.9%; Pred. No. 0;
Matches 1427; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy  1216 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 1275
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    18 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 77

Qy  1276 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 1335
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    78 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 137

Qy  1336 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 1395
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   138 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 197

Qy  1396 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 1455
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   198 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 257

Qy  1456 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 1515
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   258 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 317

Qy  1516 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 1575
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   318 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 377

Qy  1576 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 1635
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   378 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 437

Qy  1636 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGT 1695
         ||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db   438 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCCGGATGGTCAGGT 497

Qy  1696 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 1755
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   498 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 557

Qy  1756 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 1815
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   558 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 617

Qy  1816 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 1875
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   618 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 677

Qy  1876 CAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGT 1935
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   678 CAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGT 737

Qy  1936 GACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTT 1995
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   738 GACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTT 797

Qy  1996 GGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGAC 2055
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   798 GGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGAC 857

Qy  2056 AGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCT 2115
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   858 AGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCT 917



Qy  2116 GCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGC 2175
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   918 GCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGC 977

Qy  2176 GCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGA 2235
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   978 GCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGA 1037

Qy  2236 TGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCT 2295
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1038 TGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCT 1097

Qy  2296 GACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGT 2355
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1098 GACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGT 1157

Qy  2356 GAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTG 2415
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1158 GAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTG 1217

Qy  2416 AACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGA 2475
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1218 AACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGA 1277

Qy  2476 GCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACC 2535
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1278 GCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACC 1337

Qy  2536 AGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTT 2595
         ||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||
Db  1338 AGTACTGCTCTCCCTGCTGATTCCTACCAAATCGGGGCCATTGCAGGCATCATCATTCTT 1397

Qy  2596 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACA 2644
         |||||||||||||||||||||||||||||||||||||||||||||||||
Db  1398 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACA 1446



RESULT 5 
US-10-365-227-9 

; Sequence 9, Application US/10365227 

; Publication No. US20030143632A1

; GENERAL INFORMATION: 

; APPLICANT: McCarthy, Sean A. 

; APPLICANT: Holtzman, Douglas A. 

; APPLICANT: Goodearl, Andrew D.J.
; TITLE OF INVENTION: NOVEL GENES ENCODING PROTEINS HAVING
; TITLE OF INVENTION: PROGNOSTIC, DIAGNOSTIC, PREVENTIVE, THERAPEUTIC AND OTHER
; TITLE OF INVENTION: USES
; FILE REFERENCE: 07334-323001
; CURRENT APPLICATION NUMBER: US/10/365,227
; CURRENT FILING DATE: 2003-02-12
; PRIOR APPLICATION NUMBER: US/09/802,582
; PRIOR FILING DATE: 2001-03-08
; PRIOR APPLICATION NUMBER: US 09/128,709
; PRIOR FILING DATE: 1998-08-04
; PRIOR APPLICATION NUMBER: US 60/054,645
; PRIOR FILING DATE: 1997-08-04
; PRIOR APPLICATION NUMBER: US 09/130,491
; PRIOR FILING DATE: 1998-08-06
; PRIOR APPLICATION NUMBER: US 60/054,966
; PRIOR FILING DATE: 1997-08-06
; PRIOR APPLICATION NUMBER: US 60/058,108
; PRIOR FILING DATE: 1997-09-05
; PRIOR APPLICATION NUMBER: US 09/388,280
; PRIOR FILING DATE: 1999-09-01
; PRIOR APPLICATION NUMBER: US 09/388,279
; PRIOR FILING DATE: 1999-09-01
; NUMBER OF SEQ ID NOS: 20
; SOFTWARE: FastSEQ for Windows Version 4.
; SEQ ID NO 9
; LENGTH: 1448
; TYPE: DNA
; ORGANISM: Homo sapiens
US-10-365-227-9 



Query Match 41.7%; Score 1425.8; DB 14; Length 1448;
Best Local Similarity 99.9%; Pred. No. 0;
Matches 1427; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy  1216 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 1275
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    18 GGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACT 77

Qy  1276 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 1335
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    78 GGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTG 137

Qy  1336 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 1395
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   138 GGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCT 197

Qy  1396 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 1455
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   198 CCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGA 257

Qy  1456 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 1515
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   258 TGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGA 317

Qy  1516 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 1575
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   318 GCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGC 377

Qy  1576 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 1635
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   378 GAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGC 437

Qy  1636 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGT 1695
         ||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db   438 CACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCCGGATGGTCAGGT 497

Qy  1696 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 1755
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   498 GTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGC 557

Qy  1756 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 1815
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   558 TACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGC 617

Qy  1816 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 1875
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   618 TTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGC 677

Qy  1876 CAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGT 1935
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   678 CAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGT 737

Qy  1936 GACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTT 1995
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   738 GACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTT 797

Qy  1996 GGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGAC 2055
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   798 GGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGTAACCCCATTGAC 857

Qy  2056 AGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCT 2115
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   858 AGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAACCATGTCCACCT 917

Qy  2116 GCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGC 2175
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   918 GCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGC 977

Qy  2176 GCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGA 2235
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   978 GCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTACTGCACTCAGAGA 1037

Qy  2236 TGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCT 2295
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1038 TGTCCTCTAGGGTTTTATGGAAAAGATTGTGCACTGATATGCCAATGTCAAAACGGAGCT 1097

Qy  2296 GACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGT 2355
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1098 GACTGCGACCACATTTCTGGGCAGTGTACTTGCCGCACTGGATTCATGGGACGGCACTGT 1157

Qy  2356 GAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTG 2415
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1158 GAGCAGAAGTGCCCTTCAGGAACATATGGCTATGGCTGTCGCCAGATATGTGATTGTCTG 1217

Qy  2416 AACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGA 2475
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1218 AACAACTCCACCTGCGACCACATCACTGGGACCTGTTACTGCAGCCCCGGATGGAAGGGA 1277

Qy  2476 GCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACC 2535
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1278 GCGAGATGTGATCAAGCTGGTGTTATCATAGTTGGAAATCTGAACAGCTTAAGCCGAACC 1337

Qy  2536 AGTACTGCTCTCCCTGCTGATTCCTACCAGATCGGGGCCATTGCAGGCATCATCATTCTT 2595
         ||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||
Db  1338 AGTACTGCTCTCCCTGCTGATTCCTACCAAATCGGGGCCATTGCAGGCATCATCATTCTT 1397

Qy  2596 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACA 2644
         |||||||||||||||||||||||||||||||||||||||||||||||||
Db  1398 GTCCTAGTTGTTCTCTTCCTACTGGCATTGTTCATTATTTATAGACACA 1446
; NUMBER OF SEQ ID NOS: 97
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 9
; LENGTH: 3114
; TYPE: DNA
; ORGANISM: Homo sapiens
US-10-052-648A-9 

Query Match 18.7%; Score 640.6; DB 15; Length 3114; 

Best Local Similarity 57.5%; Pred. No. 1.6e-196; 

Matches 1235; Conservative 0; Mismatches 899; Indels 15; Gaps 4; 

Qy 62 GGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACT 121

II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 47 GGCTGGCTGGAACTCTCAACCCCAGTGATCCCAATACCTGCAGCTTCTGGGAAAGCTTCA 106

Qy 122 CAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC- 180

I II I I I I I I I I I I I I I I I I I II I I I I I 

Db 107 CTACCACCACCAAGGAGTCCCACTCCCGCCCCTTCAGCCTGCTCCCCTCAGAGCCCTGCG 166 

Qy 181 --ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCT 238

I I I I II I I I I I I I I I I I I I I I I I I 

Db 167 AGCGGCCCTGGGAGGGCCCCCATACTTGCCCCCAGCCCACGGTTGTATACCGGACCGTGT 226

Qy 239 ATCGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATG 298

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 227 ACCGTCAGGTGGTGAAGACGGACCACCGCCAGCGCCTGCAGTGCTGCCATGGCTTCTATG 286 

Qy 299 AAAGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTG 358 

I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 

Db 287 AGAGCAGGGAGTTCTGTGTCCCGCTCTGTGCCCAGGAGTGTGTCCATGGCCGTTGTGTGG 346

Qy 359 CTCCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCG 418 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 347 CACCCAATCAGTGCCAATGTGTGCCAGGCTGGCGGGGCGACGACTGTTCCAGTGAGTGTG 406 

Qy 419 ATGGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGT 478

I I I I I I I I I I I I II I III III II II 

Db 407 CCCCAGGAATGTGGGGGCCACAGTGTGACAAGCCCTGCAGTTGCGGCAACAACAGCTCGT 466

Qy 479 GCAACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGG 538

I I I I I I I I I I I II II I I I I I III I I I I II 

Db 467 GTGATCCCAAGAGTGGGGTATGTTCTTGCCCTTCTGGTCTGCAGCCCCCGAACTGCCTTC 526

Qy 539 ACCGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATG 598

I I I I I I I II I I I I I I I I I I I II I I I I I I II I I I III 

Db 527 AGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGTTCCGCTGCCAGTGCC---ATG 583

Qy 599 GAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCT 658 

III I I I I II I I I I I I I I III I I I I I I Ml I I I I M 

Db 584 GGGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGAGAGAACTGGGCCCA 643 

Qy 659 TCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTT 718 

I I I I I I I I I I I I I I I Ml II I I I I I 

Db 644 GCTGTGACGTGTCCTGTTCCCAGGGCACTTCTGGCTTCTTCTGCCCCAGCACCCATCCTT 703 

Qy 719 GTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGA 778 



II I I I I I I I I I I Mill I I I I I I I I I 

Db 704 GCCAAAATGGAGGTGTCTTCCAAACCCCACAGGGCTCCTGCAGCTGCCCCCCTGGCTGGA 763

Qy 779 TGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAG 838

I I I I I I I IN I I I I I I I I I I I I I I III I I I I I I I I I I I 

Db 764 TGGGCACCATCTGCTCCCTGCCCTGCCCAGAGGGCTTTCACGGACCCAACTGCTCCCAGG 823

Qy 839 AATGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTC 898

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I Ml M 

Db 824 AATGTCGCTGCCACAACGGCGGCCTCTGTGACCGATTCACTGGGCAGTGCCGCTGCGCTC 883

Qy 899 CAGGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCT 958

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M M IN M 

Db 884 CGGGTTACACTGGGGATCGGTGCCGGGAGGAGTGCCCGGTGGGCCGCTTTGGGCAGGACT 943

Qy 959 GTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCC 1018

I I I I I I | I II I i I I I I I I I I I I I I I I I II I I I I I I I I 

Db 944 GTGCTGAGACGTGCGACTGCGCCCCGGACGCCCGTTGCTTCCCGGCCAACGGCGCATGTC 1003

Qy 1019 TCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCT 1078

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I M 

Db 1004 TGTGCGAACACGGCTTCACTGGGGACCGCTGCACGGATCGCCTCTGCCCCGACGGCTTCT 1063

Qy 1079 ACGGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACC 1138

I I I I III II I I II I I I I I I I III III I I I I I I I I I I 

Db 1064 ACGGTCTCAGCTGCCAGGCCCCCTGCACCTGCGACCGGGAGCACAGCCTCAGCTGCCACC 1123

Qy 1139 CCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACAT 1198

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I 
Db 1124 CGATGAACGGGGAGTGCTCCTGCCTGCCGGGCTGGGCGGGCCTCCACTGCAACGAGAGCT 1183

Qy 1199 GTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAG 1258

I II I I I I I II I I I I I I I I III I I I I I INI I 

Db 1184 GCCCGCAGGACACGCATGGGCCAGGGTGCCAGGAGTACTGTCTCTGCCTGCACGGTGGCG 1243

Qy 1259 ACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCT 1318

I M | | I II II I I I I II I I I I I II I I I I I 

Db 1244 TCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGCCGGGTTACACGGGCCCTCACTGTG 1303

Qy 1319 CTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAA 1378

Ml I MINI I II I I I I I I I I I I I I I I I I I I II II I I I I 

Db 1304 CTAGTCTTTGTCCTCCTGACACCTACGGTGTCAACTGTTCTGCACGCTGCTCATGTGAAA 1363

Qy 1379 ATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGG 1438

III I I I I I I I I I I II I I I I II I 

Db 1364 ATGCCATCGCCTGCTCACCCATCGACGGCGAGTGCGTCTGCAAGGAAGGTTGGCAGCGTG 1423

Qy 1439 TGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCC 1498

I II II I I I I I I I I I I I I I I I I M I I I I I I I II I I 

Db 1424 GTAACTGCTCTGTGCCCTGCCCACCCGGAACCTGGGGCTTCAGTTGCAATGCCAGCTGCC 1483

Qy 1499 AGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGAT 1558

I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I 

Db 1484 AGTGTGCCCATGAGGCAGTCTGCAGCCCCCAAACTGGAGCCTGTACCTGCACCCCTGGGT 1543

Qy 1559 GGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTG 1618
I I I ' I I I I I I I I I I I I I I I I I I I I I II M I I I I I 



Db 1544 GGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGGGGCAGTTTGGAGAAGGTTGTGCCA 1603



Qy 1619 AGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCC 1678 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml III I I I I 

Db 1604 GTCGCTGTGACTGTGACCACTCTGATGGCTGTGACCCTGTTCATGGACGCTGTCAGTGCC 1663 

Qy 1679 TCCCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCA 1738 

II I I I I I I I I II M I I M M I I I I I I II 

Db 1664 AGGCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCTGCCCTGAGGGCTTATGGGGAGTCA 1723 

Qy 1739 ACTGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCT 1798

I I I I I I I I I I I I I I I I I I I I I I Ml II I I I I I I I I I I I I 

Db 1724 ACTGTAGCAACACCTGCACCTGCAAGAATGGGGGCACCTGTCTCCCTGAGAATGGCAACT 1783 

Qy 1799 GCGAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTT 1858 

I II I I II I I I I I I I I I I I I I I I II I I I I I I I I Ml M I I I I 

Db 1784 GCGTGTGTGCACCCGGATTCCGGGGCCCCTCCTGCCAGAGATCCTGTCAGCCTGGCCGCT 1843 

Qy 1859 ATGGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACC 1918 

I I M I II I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1844 ATGGCAAACGCTGTGTGC CCTGCAAGTGCGCTAACCACTCC TTCTGCCACC 1894 

Qy 1919 ACATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGT 1978 

I I M I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1895 CCTCGAACGGGACCTGCTACTGCCTGGCTGGCTGGACAGGCCCCGACTGCTCCCAGCCAT 1954 

Qy 1979 GTCCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAA 2038

Ml II I I I I I I I I I I I I I IN M M Ml 

Db 1955 GCCCTCCAGGACACTGGGGAGAAAACTGTGCCCAGACCTGCCAATGTCACCATGGTGGGA 2014 

Qy 2039 CCTGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCT 2098

I I I I I I I I II I Ml I I I II I I I I I I I I I 

Db 2015 CCTGCCATCCCCAGGATGGGAGCTGTATCTGCCCCCTAGGCTGGACTGGACACCACTGCT 2074 

Qy 2099 CTCAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATA 2158 

II I I I I I I I II I I I I I I I III I I I I I I I I 

Db 2075 TAGAAGGCTGCCCTCTGGGGACATTTGGTGCTAACTGCTCCCAGCCATGCCAGTGTGGTC 2134 

Qy 2159 ATGGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGG 2207 

II I I I III II I I I I I I III IN I I I I I 

Db 2135 CTGGAGAAAAGTGCCACCCAGAGACTGGGGCCTGTGTATGTCCCCCAGG 2183
NUMBER OF SEQ ID NOS: 97
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 7 
LENGTH: 5000 
TYPE: DNA 

ORGANISM: Homo sapiens 
US-10-052-648A-7 

Query Match 18.6%; Score 635.8; DB 15; Length 5000; 

Best Local Similarity 57.3%; Pred. No. 7.9e-195; 

Matches 1232; Conservative 0; Mismatches 902; Indels 15; Gaps 4; 

Qy 62 GGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACT 121

II || I I I I I I I I 11 II III Mill I I I I I I I I I I I I 

Db 129 GGCTGGCTGGAACTCTCAACCCCAGTGATCCCAATACCTGCAGCTTCTGGGAAAGCTTCA 188 

Qy 122 CAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC- 180

I II I I I I I I I I I I I I I I I I I M I I I I I 

Db 189 CTACCACCACCAAGGAGTCCCACTCCCGCCCCTTCAGCCTGCTCCCCTCAGAGCCCTGCG 248 



Qy 181 --ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCT 238



Db 249 AGCGGCCCTGGGAGGGCCCCCATACTTGCCCCCAGCCCACGGTTGTATACCGGACCGTGT 308 

Qy 239 ATCGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATG 298

I I I I I I I I I I I I I I III I I I I I I I I I I I II I I I I 

Db 309 ACCGTCAGGTGGTGAAGACGGACCACCGCCAGCGCCTGCAGTGCTGCCATGGCTTCTATG 368 

Qy 299 AAAGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTG 358 

I ill III I I I I I I I I I I I I I I I I I 1 1 I I I I I I I 

Db 369 AGAGCAGGGGGTTCTGTGTCCCGCTCTGTGCCCAGGAGTGTGTCCATGGCCGTTGTGTGG 428 

Qy 359 CTCCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCG 418 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 429 CACCCAATCAGTGCCAATGTGTGCCAGGCTGGCGGGGCGACGACTGTTCCAGTGAGTGTG 488

Qy 419 ATGGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGT 478

I I I I I I I I I I I I I I I I I I III II II 

Db 489 CCCCAGGAATGTGGGGGCCACAGTGTGACAAGCCCTGCAGCTGCGGCAACAACAGCTCGT 548

Qy 479 GCAACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGG 538 

I' Mill I INI II I I I I I I I I I I I I II I I 

Db 549 GTGATCCCAAGAGTGGGGTATGTTCTTGCCCTTCTGGTCTGCAGCCCCCGAACTGCCTTC 608 

Qy 539 ACCGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATG 598

I I I I I I I III I I I I I I I I I I I I I I I I I I I I I II III 

Db 609 AGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGTTCCGCTGCCAGTGCC ATG 665 

Qy 599 GAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCT 658 

III I I I I I I I I I I I I I I III I I I I I I III I I I I II 

Db 666 GGGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGAGAGAACTGGGCCCA 725

Qy 659 TCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTT 718

I I I I I I I I I I I I I I I III II I Mil 

Db 726 GCTGTGACGTGTCCTGTTCCCAGGGCACTTCTGGCTTCTTCTGCCCCAGCACCCATCCTT 785

Qy 719 GTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGA 778 

I I I I I I I I I I I I I I I II I II Ml I I I I I I I I I I I I I I 

Db 786 GCCAAAATGGAGGTGTCTTCCAAACCCCACAGGGCTCCTGCAGCTGCCCCCCTGGCTGGA 845 

Qy 779 TGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAG 838

I I I I I I I III I I I I I I I I I I I I I I III I I I II I I I I I I 

Db 846 TGGGCACCATCTGCTCCCTGCCCTGCCCAGAGGGCTTTCACGGACCCAACTGCTCCCAGG 905

Qy 839 AATGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTC 898

I I I I I I I I I I I I I I I I Mill II I II M I I I III II 

Db 906 AATGTCGCTGCCACAACGGCGGCCTCTGTGACCGATTCACTGGGCAGTGCCGCTGCGCTC 965 

Qy 899 CAGGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCT 958 

I II II I II II II I I I I I I II I II II I I I I I I I II II II I M 

Db 966 CGGGTTACACTGGGGATCGGTGCCGGGAGGAGTGCCCGGTGGGCCGCTTTGGGCAGGACT 1025 

Qy 959 GTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCC 1018 

I II I I I I I II II I I I I I I I I I I I I I I I II I I II II I I 

Db 1026 GTGCTGAGACGTGCGACTGCGCCCCGGACGCCCGTTGCTTCCCGGCCAACGGCGCATGTC 1085 



Qy 1019 TCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCT 1078

I II M I II II I I II I I I I I II II I II I II M I I II M III 



Db 1086 TGTGCGAACACGGCTTCACTGGGGACCGCTGCACGGATCGCCTCTGCCCCGACGGCTTCT 1145 

Qy 1079 ACGGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACC 1138

I II I III I I I I I I I II I I I II! Ill I I I I I I I I I I 

Db 1146 ACGGTCTCAGCTGCCAGGCCCCCCGCACCTGCGACCGGGAGCACAGCCTCAGCTGCCACC 1205 

Qy 1139 CCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACAT 1198

I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 

Db 1206 CGATGAACGGGGAGTGCTCCTGCCTGCCGGGCTGGGCGGGCCTCCACTGCAACGAGAGCT 1265 

Qy 1199 GTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAG 1258 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1266 GCCCGCAGGACACGCATGGGCCAGGGTGCCAGGAGCACTGTCTCTGCCTGCACGGTGGCG 1325

Qy 1259 ACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCT 1318

I I I I I III II I I I I I I I I I I I I I I I I I I 

Db 1326 TCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGCCGGGTTACACGGGCCCTCACTGTG 1385 

Qy 1319 CTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAA 1378 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1386 CTAGTCTTTGTCCTCCTGACACCTACGGTGTCAACTGTTCTGCACGCTGCTCATGTGAAA 1445 

Qy 1379 ATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGG 1438

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II 

Db 1446 ATGCCATCGCCTGCTCACCCATCGACGGCGAGTGCGTCTGCAAGGAAGGTTGGCAGCGTG 1505 

Qy 1439 TGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCC 1498

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1506 GTAACTGCTCTGTGCCCTGCCCACCCGGAACCTGGGGCTTCAGTTGCAATGCCAGCTGCC 1565 

Qy 1499 AGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGAT 1558

I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I II I I I 

Db 1566 AGTGTGCCCATGAGGCAGTCTGCAGCCCCCAAACTGGAGCCTGTACCTGCACCCCTGGGT 1625

Qy 1559 GGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTG 1618 

III I I I I I I I I I I I I I I I I I I I I I II M Mill 

Db 1626 GGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGGGGCAGTTTGGAGAAGGTTGTGCCA 1685 

Qy 1619 AGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCC 1678 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I III III I I I I 

Db 1686 GTCGCTGTGACTGTGACCACTCTGATGGCTGTGACCCTGTTCATGGACGCTGTCAGTGCC 1745 

Qy 1679 TCCCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCA 1738 

I I I I I I I I I I I I I I I II II I I I I I I I I I I II II 

Db 1746 AGGCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCTGCCCTGAGGGCTTATGGGGAGTCA 1805

Qy 1739 ACTGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCT 1798 

I I I I I Mill I I I I I I I I I I I I III I I I I I I I I I I II I I 

Db 1806 ACTGTAGCAACACCTGCACCTGCAAGAATGGGGGCACCTGTCTCCCTGAGAATGGCAACT 1865 

Qy 1799 GCGAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTT 1858 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I Ml M I II I 

Db 1866 GCGTGTGTGCGCCCGGATTCCGGGGCCCCTCCTGCCAGAGATCCTGTCAGCCTGGCCGCT 1925 

Qy 1859 ATGGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACC 1918

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1926 ATGGCAAACGCTGTGTGC   CCTGCAAGTGCGCTAACCACTCC   TTCTGCCACC 1976



Qy 1919 ACATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGT 1978 

I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1977 CCTCGAACGGGACCTGCTACTGCCTGGCTGGCTGGACAGGCCCCGACTGCTCCCAGCCAT 2036 

Qy 1979 GTCCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAA 2038 

III I I I II I I I I I I I I I I I II II M HI 

Db 2037 GCCCTCCAGGACACTGGGGAGAAAACTGTGCCCAGACCTGCCAATGTCACCATGGTGGGA 2096 

Qy 2039 CCTGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCT 2098 

I I I I I I I I II I I I I M M MINIMI I I I I I I 

Db 2097 CCTGCCATCCCCAGGATGGGAGCTGTATCTGCCCCCTAGGCTGGACTGGACACCACTGCT 2156 

Qy 2099 CTCAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATA 2158 

II I I I I I I I I I I I I I I I I I I I I I I I I M I 

Db 2157 TAGAAGGCTGCCCTCTGGGGACATTTGGTGCTAACTGCTCCCAGCCATGCCAGTGTGGTC 2216 

Qy 2159 ATGGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGG 2207 

I I I I I III II I I I I I I III Ml I I I II 

Db 2217 CTGGAGAAAAGTGCCACCCAGAGACTGGGGCCTGTGTATGTCCCCCAGG 2265 
Qy 67 GCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACTCAGTG 126

II I I I I I I I I I II I I I II I I II I I I I I I I II I I I I I I 

Db 236 GCTGGAACACTCAACTCCAATGATCCCAATGTCTGTACCTTCTGGGAAAGCTTCACCACG 295

Qy 127 ACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGCACT 183

II I Mill III I Mill I II I II M 

Db 296 ACCACTAAGGAGTCCCACCTTCGCCCCTTCAGCCTGCCCCCAGCCGAGTCCTGCGACAGG 355 

Qy 184 GACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTATCGA 243

I III I I I I I I I I I M II I I I I I I I I I I I 

Db 356 CCCTGGGAAGACCCCCACACCTGCGCTCAGCCTACGGTTGTCTACCGGACTGTGTACCGT 415 

Qy 244 CATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAAAGC 303

I I I I I I I I I I I II II I I I I I I I I I I 

Db 416 CAGGTGGTGAAGATGGACTCCCGCCCACGCCTGCAGTGCTGTGGGGGTTACTACGAGAGC 475 



Qy 304 GGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCA 363 

III I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 476 AGTGGAGCCTGTGTCCCACTCTGTGCCCAGGAGTGTGTCCACGGTCGCTGTGTGGCTCCT 535 

Qy 364 AACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGT 423

I! I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I 

Db 536 AATCGGTGCCAGTGTGCACCAGGCTGGCGGGGTGACGACTGTTCCAGTGAGTGTGCTCCT 595 

Qy 424 GATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAAC 483

I I I I I I I I I I I I I I I I I II 

Db 596 GGAATGTGGGGACCACAGTGTGACAGGCTCTGCCTCTGTGGCAACAGCAGTTCCTGTGAT 655

Qy 484 CCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGC 543

I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 656 CCCAGGAGTGGGGTGTGTTTTTGCCCCTCTGGCCTGCAGCCCCCCGACTGCCTTCAGCCT 715 

Qy 544 TGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCC 603

II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 716 TGCCCCGATGGCCACTATGGTCCTGCCTGCCAGTTTGATTGCCATTGC TATGGGGCA 772 

Qy 604 ACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGT 663 

I I I I I I I I I III III I I I I I I I I I I I I I II I I I 

Db 773 TCCTGTGAGGCCCGGGATGGAGCCTGCTTCTGCCCCCCAGGGAGAACAGGACCCAGGGCA 832

Qy 664 GAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAA 723

I I I I I I I III I I I I I I II I 

Db 833 CTGATGGCTTCTTCTGCCCCAGAAC TTATCCTTGCCAA 870 

Qy 724 AATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGC 783

I I I I I I I I I I III I II III I I I I I I I I I I I I I I I I I 

Db 871 AATGGAGGTGTTCCTCAGGGCTCTCAAGGCTCCTGCAGCTGCCCACCGGGCTGGATGGGT 930 

Qy 784 ACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGC 843

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 

Db 931 GTCATCTGTTCCCTGCCATGCCCAGAGGGTTTCCACGGACCCAACTGTACTCAGGAATGT 990 

Qy 844 CAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGA 903

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 991 CGTTGCCACAATGGTGGCCTTTGTGACAGGTTTACTGGGCAGTGCCACTGTGCTCCTGGC 1050 

Qy 904 TACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCT 963 

III I I I I I I I I I I I I I I I I I I I I I I I I I I II II II I I I I I 

Db 1051 TATATCGGGGATCGGTGCCGTGAAGAGTGCCCTGTGGGCCGCTTCGGTCAAGACTGTGCT 1110 

Qy 964 GAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGT 1023

I I I I I I I I I I I I I II I III I I I II I I I I I I I I I 

Db 1111 GAGACCTGTGACTGTGCTCCTGGCGCTCGTTGCTTTCCTGCCAATGGCGCGTGTCTGTGC 1170 

Qy 1024 GAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGC 1083 

III I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1171 GAACATGGCTTCACAGGCGACCGCTGCACTGAGCGACTCTGTCCAGATGGCCGCTATGGT 1230 

Qy 1084 ATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATG 1143

II II I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1231 CTGAGCTGCCAAGATCCCTGCACCTGCGACCCAGAACACAGTCTCAGCTGCCACCCAATG 1290



Qy 1144 TCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCT 1203 



Db 1291 CACGGCGAGTGCTCCTGCCAGCCAGGTTGGGCGGGCCTCCACTGCAACGAGAGCTGCCCT 1350 

Qy 1204 CCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGT 1263 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 1351 CAGGACACGCACGGAGCCGGTTGCCAGGAGCACTGCCTCTGTCTGCACGGCGGTGTTTGC 1410 

Qy 1264 GACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACC 1323

I I I II III I I I I I I I I I I I I III I I I I I I I I I 

Db 1411 CTCGCCGACAGCGGCCTCTGCCGGTGTGCACCTGGCTACACGGGACCTCACTGCGCTAAT 1470 

Qy 1324 CCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGAT 1383

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 

Db 1471 CTTTGTCCACCTAACACTTATGGGATCAACTGTTCCTCCCACTGCTCCTGTGAAAATGCC 1530 

Qy 1384 GCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGAC 1443 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 1531 ATTGCCTGCTCTCCTGTCGACGGCACGTGCATCTGCAAGGAAGGTTGGCAGCGTGGTAAC 1590 

Qy 1444 TGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGC 1503

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1591 TGCTCTGTGCCCTGTCCCCCTGGCACCTGGGGCTTCAGTTGCAATGCCAGTTGCCAGTGT 1650 

Qy 1504 CTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGC 1563 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 

Db 1651 GCCCACGAGGGAGTCTGCAGCCCCCAAACTGGAGCCTGTACTTGCACCCCTGGGTGGCGT 1710 

Qy 1564 GGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGC 1623 

I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I 

Db 1711 GGGGTTCACTGCCAACTTCCGTGCCCGAAGGGACAGTTTGGTGAAGGTTGTGCCAGTGTC 1770 

Qy 1624 TGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCG 1683 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1771 TGTGACTGTGACCACTCCGATGGCTGTGACCCTGTTCATGGACACTGCCGATGTCAGGCT 1830 

Qy 1684 GGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGC 1743 

I I I I I II I II II II I I I I I I I I I I I I I I I I I I I 

Db 1831 GGCTGGATGGGCACACGTTGCCACCTGCCTTGCCCAGAGGGCTTTTGGGGAGCCAACTGC 1890

Qy 1744 TCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAG 1803 

I I I I I I I I I I I II I I I III I I I I I I I I I I I I I I I 

Db 1891 AGCAATGCCTGTACCTGCAAGAATGGTGGCACTTGTGTACCTGAGAACGGCAACTGTGTG 1950 

Qy 1804 TGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGG 1863 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1951 TGCGCACCAGGGTTCAGAGGCCCCTCCTGCCAGAGGCCCTGCCCGCCTGGTCGCTATGGC 2010 

Qy 1864 CATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATC 1923 

I I I I I I II I II I I I I II I I I I 

Db 2011 AAACGCTG TGTGCCCTGCAAGTGCAACAACCATTCTTCCTGCCACCCGTCG 2061 

Qy 1924 ACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCC 1983 

II II I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I 

Db 2062 GATGGGACCTGCTCCTGCCTGGCAGGCTGGACAGGCCCTGACTGCTCTGAATCATGTCCC 2121 

Qy 1984 AGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGT 2043 

III III I I I I I I M M I I I I I I I 



Db 2122 CCAGGCCACTGGGGACTCAAATGCTCCCAACCCTGCCAGTGTCATCATGGTGCCACCTGC 2181



Qy 2044 AACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAA 2103 

I I I I I II I III II I I I I I I I I I I I I II 

Db 2182 CACCCCCAGGATGGGAGCTGTGTCTGCATCCCAGGCTGGACTGGACCCAACTGCTCGGAA 2241 

Qy 2104 CCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGA 2163

I I I I I I I II I I I I I I IN M I I I II I I II 

Db 2242 GGCTGCCCATCAAGAATGTTTGGTGTCAACTGCTCCCAGCTATGTCAGTGTGATCCTGGA 2301 

Qy 2164 GCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTAC 2223 

I I I I I I I I I I I I I I I III M 

Db 2302 GAGATGTGCCACCCAGAGACTGGGGCTTGCGTCTGTCCCCCAGGACACAGTGGTGCGCAC 2361 

Qy 2224 TGCA 2227 

I I I I 

Db 2362 TGCA 2365 
; PRIOR APPLICATION NUMBER: 09/312,359

; PRIOR FILING DATE: 1999-05-14 

; PRIOR APPLICATION NUMBER: 09/336,536 

; PRIOR FILING DATE: 1999-06-18 

; PRIOR APPLICATION NUMBER: 09/342,687 

; PRIOR FILING DATE: 1999-06-29 

; PRIOR APPLICATION NUMBER: 09/345,464 

; PRIOR FILING DATE: 1999-06-30 

; PRIOR APPLICATION NUMBER: 09/365,164 

; PRIOR FILING DATE: 1999-07-30 

; PRIOR APPLICATION NUMBER: 09/399,723 

; PRIOR FILING DATE: 1999-09-20 

; PRIOR APPLICATION NUMBER: 09/409,634 

; PRIOR FILING DATE: 1999-09-30 

; PRIOR APPLICATION NUMBER: 09/471,179 

; PRIOR FILING DATE: 1999-12-23 



; PRIOR APPLICATION NUMBER: 09/474,071 
; PRIOR FILING DATE: 1999-12-29 
; PRIOR APPLICATION NUMBER: 09/474,072 
; PRIOR FILING DATE: 1999-12-29 

PRIOR APPLICATION NUMBER: 09/514,010 
; PRIOR FILING DATE: 2000-02-25 
; PRIOR APPLICATION NUMBER: 09/516,745 
; PRIOR FILING DATE: 2000-03-01

PRIOR APPLICATION NUMBER: 09/572,002 

PRIOR FILING DATE: 2000-05-14 
; PRIOR APPLICATION NUMBER: 09/597,993 
; PRIOR FILING DATE: 2000-06-19 
; PRIOR APPLICATION NUMBER: 09/599,596 
; PRIOR FILING DATE: 2000-06-22 
; PRIOR APPLICATION NUMBER: 09/630,334 
; PRIOR FILING DATE: 2000-07-31 

PRIOR APPLICATION NUMBER: 09/606,565 

PRIOR FILING DATE: 2000-06-29 
; PRIOR APPLICATION NUMBER: 09/606,317 
; PRIOR FILING DATE: 2000-06-29 
; PRIOR APPLICATION NUMBER: 09/665,666 
; PRIOR FILING DATE: 2000-09-20 
; PRIOR APPLICATION NUMBER: 09/677,751 
; PRIOR FILING DATE: 2000-09-30 
; NUMBER OF SEQ ID NOS: 162 
; SEQ ID NO 123 

LENGTH: 3567 
; TYPE: DNA 

; ORGANISM: Rattus sp.
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (925)...(2832)
US-09-796-753-123 

Query Match 17.8%; Score 608.4; DB 10; Length 3567; 

Best Local Similarity 56.7%; Pred. No. 5.2e-186; 

Matches 1226; Conservative 0; Mismatches 901; Indels 37; Gaps 4; 

Qy 67 GCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACTCAGTG 126

II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 

Db 236 GCTGGAACACTCAACTCCAATGATCCCAATGTCTGTACCTTCTGGGAAAGCTTCACCACG 295

Qy 127 ACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGCACT 183

II I I I I I I I I I I I I II I I I I I I I I I 

Db 296 ACCACTAAGGAGTCCCACCTTCGCCCCTTCAGCCTGCCCCCAGCCGAGTCCTGCGACAGG 355 

Qy 184 GACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCTATCGA 243 

I I I I I I I I I I I I I II I I I I I I II I I I I I 

Db 356 CCCTGGGAAGACCCCCACACCTGCGCTCAGCCTACGGTTGTCTACCGGACTGTGTACCGT 415 

Qy 244 CATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATGAAAGC 303

I I I I I I I I I I I II I I I I I I I I III I I I I I I I 

Db 416 CAGGTGGTGAAGATGGACTCCCGCCCACGCCTGCAGTGCTGTGGGGGTTACTACGAGAGC 475

Qy 304 GGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTGCTCCA 363 

III I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 476 AGTGGAGCCTGTGTCCCACTCTGTGCCCAGGAGTGTGTCCACGGTCGCTGTGTGGCTCCT 535 



Qy 364 AACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCGATGGT 423 

II I I I I I I II I I I ! I i I I I I I I I I I I I I I I I I I I I I I I I 

Db 536 AATCGGTGCCAGTGTGCACCAGGCTGGCGGGGTGACGACTGTTCCAGTGAGTGTGCTCCT 595 

Qy 424 GATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGTGCAAC 483

I I I I I I I I I I I I I I I I I I I I II II I I II I 

Db 596 GGAATGTGGGGACCACAGTGTGACAGGCTCTGCCTCTGTGGCAACAGCAGTTCCTGTGAT 655 

Qy 484 CCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGGACCGC 543

I I I I I I I I I I I II I I III I I I I I I II II 

Db 656 CCCAGGAGTGGGGTGTGTTTTTGCCCCTCTGGCCTGCAGCCCCCCGACTGCCTTCAGCCT 715 

Qy 544 TGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATGGAGCC 603

II I I I I II I II I I I I I I I I I I I I I I I I II I I I I 

Db 716 TGCCCCGATGGCCACTATGGTCCTGCCTGCCAGTTTGATTGCCATTGC TATGGGGCA 772 

Qy 604 ACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCTTCTGT 663 

I I I I I I I I I III III I I I I I I I I I I I I I I II I I 

Db 773 TCCTGTGACCCCCGGGATGGAGCCTGCTTCTGCCCCCCAGGGAGAACAGGACCCAGGGCA 832 

Qy 664 GAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTTGTCAA 723

I I I I I I I III I I I I I I I I I 

Db 833 CTGATGGCTTCTTCTGCCCCAGAAC TTATCCTTGCCAA 870 

Qy 724 AATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGATGGGC 783

I I I I I I I I I I III I II III I I I I I I I I I I I I I I I I I 

Db 871 AATGGAGGTGTTCCTCAGGGCTCTCAAGGCTCCTGCAGCTGCCCACCGGGCTGGATGGGT 930 

Qy 784 ACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTGTTCCCAAGAATGC 843

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 

Db 931 GTCATCTGTTCCCTGCCATGCCCAGAGGGTTTCCACGGACCCAACTGTACTCAGGAATGT 990

Qy 844 CAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCATTGCAGTCCAGGA 903

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I II 

Db 991 CGTTGCCACAATGGTGGCCTTTGTGACAGGTTTACTGGGCAGTGCCACTGTGCTCCTGGC 1050 

Qy 904 TACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCT 963

III I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I 

Db 1051 TATATCGGGGATCGGTGCCGTGAAGAGTGCCCTGTGGGCCGCTTCGGTCAAGACTGTGCT 1110 

Qy 964 GAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGT 1023 

I I I I I I I I I I I I I II I Ml I I I I I I I I I I I I I I 

Db 1111 GAGACCTGTGACTGTGCTCCTGGCGCTCGTTGCTTTCCTGCCAATGGCGCGTGTCTGTGC 1170

Qy 1024 GAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGC 1083 

III I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I 
Db 1171 GAACATGGCTTCACAGGCGACCGCTGCACTGAGCGACTCTGTCCAGATGGCCGCTATGGT 1230 

Qy 1084 ATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATG 1143

II II I II II I I I II I I I I I I I I II I I I I I I I I I I I I I 

Db 1231 CTGAGCTGCCAAGATCCCTGCACCTGCGACCCAGAACACAGTCTCAGCTGCCACCCAATG 1290

Qy 1144 TCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCT 1203

I I I I I I I I I I I I M I I I I I I I I M I I I I I I I I I I I I I II II 
Db 1291 CACGGCGAGTGCTCCTGCCAGCCAGGTTGGGCGGGCCTCCACTGCAACGAGAGCTGCCCT 1350 

Qy 1204 CCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGT 1263

I I I | I I I I I I II I I I II II I I I I I I I I I I I M 

Db 1351 CAGGACACGCACGGAGCCGGTTGCCAGGAGCACTGCCTCTGTCTGCACGGCGGTGTTTGC 1410

Qy 1264 GACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACC 1323

I I I II III I I I I I II I I I I I III I I I I I I I I I 

Db 1411 CTCGCCGACAGCGGCCTCTGCCGGTGTGCACCTGGCTACACGGGACCTCACTGCGCTAAT 1470

Qy 1324 CCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGAT 1383

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1471 CTTTGTCCACCTAACACTTATGGGATCAACTGTTCCTCCCACTGCTCCTGTGAAAATGCC 1530

Qy 1384 GCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGAC 1443

I I I I I I I I I I I I I II I I I I I I I I I I II II I I 

Db 1531 ATTGCCTGCTCTCCTGTCGACGGCACGTGCATCTGCAAGGAAGGTTGGCAGCGTGGTAAC 1590

Qy 1444 TGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGC 1503

I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I MINIM 

Db 1591 TGCTCTGTGCCCTGTCCCCCTGGCACCTGGGGCTTCAGTTGCAATGCCAGTTGCCAGTGT 1650

Qy 1504 CTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGC 1563

I I I I I I I I I I I I I I I I I I 

Db 1651 GCCCACGAGGGAGTCTGCAGCCCCCAAACTGGAGCCTGTACTTGCACCCCTGGGTGGCGT 1710

Qy 1564 GGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGC 1623

I I I I I I I I I I I I I I I I I I I I I I I M M I I I I I I 

Db 1711 GGGGTTCACTGCCAACTTCCGTGCCCGAAGGGACAGTTTGGTGAAGGTTGTGCCAGTGTC 1770

Qy 1624 TGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCG 1683

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1771 TGTGACTGTGACCACTCCGATGGCTGTGACCCTGTTCATGGACACTGCCGATGTCAGGCT 1830

Qy 1684 GGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGC 1743

I I I I I II I M II M I I I I I I I I I I I I I I I I I I I 

Db 1831 GGCTGGATGGGCACACGTTGCCACCTGCCTTGCCCAGAGGGCTTTTGGGGAGCCAACTGC 1890

Qy 1744 TCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAG 1803

I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I 

Db 1891 AGCAATGCCTGTACCTGCAAGAATGGTGGCACTTGTGTACCTGAGAACGGCAACTGTGTG 1950

Qy 1804 TGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGG 1863

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I 

Db 1951 TGCGCACCAGGGTTCAGAGGCCCCTCCTGCCAGAGGCCCTGCCCGCCTGGTCGCTATGGC 2010

Qy 1864 CATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATC 1923

I I I I I I II I I I I I I II II I I I 

Db 2011 AAACGCTG   TGTGCCCTGCAAGTGCAACAACCATTCTTCCTGCCACCCGTCG 2061

Qy 1924 ACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAGTGTGTCCC 1983

|| I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II 

Db 2062 GATGGGACCTGCTCCTGCCTGGCAGGCTGGACAGGCCCTGACTGCTCTGAATCATGTCCC 2121



Qy 1984 AGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAACAACGGAACCTGT 2043

Ml III I I I I I I II M I I I I I I I 

Db 2122 CCAGGCCACTGGGGACTCAAATGCTCCCAACCCTGCCAGTGTCATCATGGTGCCACCTGC 2181

Qy 2044 AACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAGTGACTGCTCTCAA 2103

Db 2182



Qy 2104 CCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAACTGCCATAATGGA 2163 

M III I III I I I I I I I I I I I I I I I I II I I 

Db 2242 GGCTGCCCATCAAGAATGTTTGGTGTCAACTGCTCCCAGCTATGTCAGTGTGATCCTGGA 2301 

Qy 2164 GCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGGCTGGACAGGGCTCTAC 2223 

I I I I I I I I I I I I I II I III M 

Db 2302 GAGATGTGCCACCCAGAGACTGGGGCTTGCGTCTGTCCCCCAGGACACAGTGGTGCGCAC 2361

Qy 2224 TGCA 2227 

I I I I 

Db 2362 TGCA 2365 



RESULT 10 
US-10-052-648A-3 

Sequence 3, Application US/10052648A 
Publication No. US20040005558A1 
GENERAL INFORMATION: 
APPLICANT: Anderson, David 
APPLICANT: Burgess, Catherine 
APPLICANT: Casman, Stacie 
APPLICANT: Colman, Steven 
APPLICANT: Edinger, Shlomit R. 
APPLICANT: Ellerman, Karen 
APPLICANT: Gerlach, Valerie 
APPLICANT: Gunther, Erik 
APPLICANT: Kekuda, Ramesh 
APPLICANT: MacDougall, John R. 
APPLICANT: Mehraban, Fuad 
APPLICANT: Patturajan, Meera 
APPLICANT: Rothenberg, Mark 
APPLICANT: Shimkets, Richard 
APPLICANT: Smithson, Glennda 
APPLICANT: Spytek, Kimberly A. 
APPLICANT: Stone, David J. 
APPLICANT: Vernet, Corine A.M. 
APPLICANT: Zerhusen, Bryan D. 

TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF 
TITLE OF INVENTION: USING THE SAME 
FILE REFERENCE: 21402-250 (CURA-550) 
CURRENT APPLICATION NUMBER: US/10/052,648A
CURRENT FILING DATE: 2002-12-09 
PRIOR APPLICATION NUMBER: 60/262,454 
PRIOR FILING DATE: 2001-01-18 
PRIOR APPLICATION NUMBER: 60/272,920 
PRIOR FILING DATE: 2001-03-02 
PRIOR APPLICATION NUMBER: 60/284,549 
PRIOR FILING DATE: 2001-04-18 
PRIOR APPLICATION NUMBER: 60/303,229 
PRIOR FILING DATE: 2001-07-05 
PRIOR APPLICATION NUMBER: 60/262,892 
PRIOR FILING DATE: 2001-01-19 
PRIOR APPLICATION NUMBER: 60/263,605 
PRIOR FILING DATE: 2001-01-23 



; PRIOR APPLICATION NUMBER: 60/269,098 

; PRIOR FILING DATE: 2001-02-15 

; PRIOR APPLICATION NUMBER: 60/264,159 

; PRIOR FILING DATE: 2001-01-25 

; PRIOR APPLICATION NUMBER: 60/265,517 

; PRIOR FILING DATE: 2001-01-31 

; PRIOR APPLICATION NUMBER: 60/271,855 

; PRIOR FILING DATE: 2001-02-27 

; Remaining Prior Application data removed - See File Wrapper or PALM. 

; NUMBER OF SEQ ID NOS: 97

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 3 

; LENGTH: 2919 

TYPE: DNA 
; ORGANISM: Homo sapiens 
US-10-052-648A-3 

Query Match 16.0%; Score 546.2; DB 15; Length 2919; 

Best Local Similarity 55.3%; Pred. No. 8e-166; 

Matches 1231; Conservative 0; Mismatches 903; Indels 93; Gaps 5; 

Qy 62 GGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACT 121

II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 129 GGCTGGCTGGAACTCTCAACCCCAGTGATCCCAATACCTGCAGCTTCTGGGAAAGCTTCA 188 

Qy 122 CAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC- 180

I II I I I I I I I I I I I I I I I I I II I I I I I 

Db 189 CTACCACCACCAAGGAGTCCCACTCCCGCCCCTTCAGCCTGCTCCCCTCAGAGCCCTGCG 248

Qy 181 --ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCT 238

I I II I I I I I I I I I II I I I I I I I I i 

Db 249 AGCGGCCCTGGGAGGGCCCCCATACTTGCCCCCAGCCCACGGTTGTATACCGGACCGTGT 308 

Qy 239 ATCGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATG 298

I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I 

Db 309 ACCGTCAGGTGGTGAAGACGGACCACCGCCAGCGCCTGCAGTGCTGCCATGGCTTCTATG 368 

Qy 299 AAAGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTG 358 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 

Db 369 AGAGCAGGGGGTTCTGTGTCCCGCTCTGTGCCCAGGAGTGTGTCCATGGCCGTTGTGTGG 428 

Qy 359 CTCCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCG 418 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 429 CACCCAATCAGTGCCAATGTGTGCCAGGCTGGCGGGGCGACGACTGTTCCAGTGAGTGTG 488

Qy 419 ATGGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGT 478 

I I I I I I I I I I I I I I I I I I III II II 

Db 489 CCCCAGGAATGTGGGGGCCACAGTGTGACAAGCCCTGCAGCTGCGGCAACAACAGCTCGT 548

Qy 479 GCAACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGG 538 

I I I I I I I I I I I II II I I I I I III I I I I I I 

Db 549 GTGATCCCAAGAGTGGGGTATGTTCTTGCCCTTCTGGTCTGCAGCCCCCGAACTGCCTTC 608 

Qy 539 ACCGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATG 598

I I I I II I III I I I I I I I I I I I I I I I I I I I I I I I III 
Db 609 AGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGTTCCGCTGCCAGTGCC ATG 665 



Qy 599 GAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCT 658 

III I I I I I I I I I Mill III I I I I II III I I I I M 

Db 666 GGGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGAGAGAACTGGGCCCA 725 

Qy 659 TCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTT 718 

I I I I I I I I I I I I I I I III II I Ml 

Db 726 GCTGTGACGTGTCCTGTTCCCAGGGCACTTCTGGCTTCTTCTGCCCCAGCACCCATTCTT 785 

Qy 719 GTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGA 778

I I I I I I I I I II I II III I I I I I II I I I I I I I 

Db 786 GCCAAAATGGAGGTGTCTTCCAAACCCCACAGGGCTCCTGCAGCTGCCCCCCTGGCTGGA 845

Qy 779 T 779 

I 

Db 846 TGGTATGGAGGGTGGGGCCTGTGGGCATGGGGTGTGGGTCTGGGGAGAATTCTGTGGGTG 905 

Qy 780 GGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTG 820 

I I I I I I III I I I I I I I I I I I I I I I 

Db 906 GTGCTAAGCAGGGCTCCAAGGGCACCATCTGCTCCCTGCCCTGCCCAGAGGGCTTTCACG 965 

Qy 821 GAAAGAACTGTTCCCAAGAATGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAG 880

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 966 GACCCAACTGCTCCCAGGAATGTCGCTGCCACAACGGCGGCCTCTGTGACCGATTCACTG 1025 

Qy 881 GCCAATGTCATTGCAGTCCAGGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTG 940

I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1026 GGCAGTGCCGCTGCGCTCCGGGTTACACTGGGGATCGGTGCCGGGAGGAGTGCCCGGTGG 1085 

Qy 941 GGACCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACC 1000

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1086 GCCGCTTTGGGCAGGACTGTGCTGAGACGTGCGACTGCGCCCCGGACGCCCGTTGCTTCC 1145

Qy 1001 ACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCC 1060 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1146 CGGCCAACGGCGCATGTCTGTGCGAACACGGCTTCACTGGGGACCGCTGCACGGATCGCC 1205 

Qy 1061 TGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAA 1120 

I I I I I I I I I I I I I II I III II I I I I I I I I I I IN 

Db 1206 TCTGCCCCGACGGCTTCTACGGTCTCAGCTGCCAGGCCCCCCGCACCTGCGACCGGGAGC 1265

Qy 1121 ACACTCATAGCTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGAC 1180

Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 
Db 1266 ACAGCCTCAGCTGCCACCCGATGAACGGGGAGTGCTCCTGCCTGCCGGGCTGGGCGGGCC 1325 

Qy 1181 TCTACTGTAATGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCA 1240

I I I I I I I I I I I I II II I I II I II I I I I I I I I III 

Db 1326 TCCACTGCAACGAGAGCTGCCCGCAGGACACGCATGGGCCAGGGTGCCAGGAGCACTGTC 1385 

Qy 1241 GCTGCCAAAATGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGAT 1300

I I I I I I I I I I II I I I I II M I I I I I I I I I 

Db 1386 TCTGCCTGCACGGTGGCGTCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGCCGGGTT 1445

Qy 1301 TCAAAGGAATTGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCT 1360

II II I I I II III I II I I I I I I I I I I I I I I I I I I I I I 

Db 1446 ACACGGGCCCTCACTGTGCTAGTCTTTGTCCTCCTGACACCTACGGTGTCAACTGTTCTG 1505

Qy 1361 CTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCA 1420

Ill 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 I 1 1 1 1 1 II I II I

Db 1506 CACGCTGCTCATGTGAAAATGCCATCGCCTGCTCACCCATCGACGGCGAGTGCGTCTGCA 1565

Qy 1421 AGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTG 1480

1 1 1 1 1 1 1 1 II 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Db 1566 AGGAAGGTTGGCAGCGTGGTAACTGCTCTGTGCCCTGCCCACCCGGAACCTGGGGCTTCA 1625

Qy 1481 GCTGTAACTTAACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCT 1540

1 1 1 1 1 1 MINIM 1 1 1 1 1 1 1 II 1 II 1 M

Db 1626 GTTGCAATGCCAGCTGCCAGTGTGCCCATGAGGCAGTCTGCAGCCCCCAAACTGGAGCCT 1685

Qy 1541 GCACGTGTGCACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGT 1600

1 1 1 1 1 II Ill 1 1 1 1 1 1 1 1 1 1 1 1 II II

Db 1686 GTACCTGCACCCCTGGGTGGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGGGGCAGT 1745

Qy 1601 ACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCA 1660

|| Mill 1 1 1 1 1 1 1 1 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Db 1746 TTGGAGAAGGTTGTGCCAGTCGCTGTGACTGTGACCACTCTGATGGCTGTGACCCTGTTC 1805

Qy 1661 CGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTG 1720

Ml Ml 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II M Ml

Db 1806 ATGGACGCTGTCAGTGCCAGGCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCTGCCCTG 1865

Qy 1721 AGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCT 1780

I I I I 1 1 1 II 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 Ml 1 1 1

Db 1866 AGGGCTTATGGGGAGTCAACTGTAGCAACACCTGCACCTGCAAGAATGGGGGCACCTGTC 1925

Qy 1781 CCCCTGATGATGGCATCTGCGAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGA 1840

I I 1 I 1 1 MINI 1 1 1 1 1 1 1 1 1 II II II 1 1 1 1 1 III II 1 II 1 1 1 1 1

Db 1926 TCCCTGAGAATGGCAACTGCGTGTGTGCGCCCGGATTCCGGGGCCCCTCCTGCCAGAGAT 1985

Qy 1841 TCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACA 1900

I M 1 1 1 1 1 1 1 1 1 1 1 Mill 1 1 1 1 1 1 II 1 M

Db 1986 CCTGTCAGCCTGGCCGCTATGGCAAACGCTGTGTGC CCTGCAAGTGCGCTAACC 2039

Qy 1901 GCAGCGGGCCCTGCCACCACATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCG 1960

1 1 II 1 1 1 Ml M 1 1 1 1 1 M 1 1 1 MM

Db 2040 ACTCC TTCTGCCACCCCTCGAACGGGGCCTGCTACTGCCTGGCTGGCTGGACAGGCC 2096

Qy 1961 CCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTA 2020

II I 1 1 1 1 1 1 M 1 M 1 1 1 1 1 1 1 1 M

Db 2097 CCGACTGCTCCCAGCCATGCCCTCCAGGACACTGGGGAGAAAACTGTGCCCAGACCTGCC 2156

Qy 2021 CCTGCACCAACAACGGAACCTGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTT 2080

II II II II 1 1 1 1 1 1 1 M 1 III II M Ml

Db 2157 AATGTCACCATGGTGGGACCTGCCATCCCCAGGATGGGAGCTGTATCTGCCCCCTAGGCT 2216

Qy 2081 GGATTGGCAGTGACTGCTCTCAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCC 2140

II 1 1 II MIMI M 1 1 M 1 1 1

Db 2217 GGACTGGACACCACTGCTTAGAAGGCTGCCCTCTGGGGACATTTGGTGCTAACTGCTCCC 2276

Qy 2141 ACACGTGCAACTGCCATAATGGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCA 2200

1 1 II 1 1 1 1 1 II 1 II III M 1 IMM II 1 III

Db 2277 AGCCATGCCAGTGTGGTCCTGGAGAAAAGTGCCACCCAGAGACTGGGGCCTGTGTATGTC 2336

Qy 2201 CTCCTGG 2207

I 1 1 1 1

Db 2337 CCCCAGG 2343



RESULT 11 
US-10-052-648A-5 

Sequence 5, Application US/10052648A 
Publication No. US20040005558A1 
GENERAL INFORMATION: 
APPLICANT: Anderson, David 
APPLICANT: Burgess, Catherine 
APPLICANT: Casman, Stacie 
APPLICANT: Colman, Steven 
APPLICANT: Edinger, Shlomit R. 
APPLICANT: Ellerman, Karen 
APPLICANT: Gerlach, Valerie 
APPLICANT: Gunther, Erik 
APPLICANT: Kekuda, Ramesh 
APPLICANT: MacDougall, John R.
APPLICANT: Mehraban, Fuad 
APPLICANT: Patturajan, Meera 
APPLICANT: Rothenberg, Mark 
APPLICANT: Shimkets, Richard 
APPLICANT: Smithson, Glennda 
APPLICANT: Spytek, Kimberly A. 
APPLICANT: Stone, David J. 
APPLICANT: Vernet, Corine A.M. 
APPLICANT: Zerhusen, Bryan D. 

TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF 
TITLE OF INVENTION: USING THE SAME 
FILE REFERENCE: 21402-250 (CURA-550) 
CURRENT APPLICATION NUMBER: US/10/052,648A
CURRENT FILING DATE: 2002-12-09 
PRIOR APPLICATION NUMBER: 60/262,454 
PRIOR FILING DATE: 2001-01-18 
PRIOR APPLICATION NUMBER: 60/272,920 
PRIOR FILING DATE: 2001-03-02 
PRIOR APPLICATION NUMBER: 60/284,549 
PRIOR FILING DATE: 2001-04-18 
PRIOR APPLICATION NUMBER: 60/303,229 
PRIOR FILING DATE: 2001-07-05 
PRIOR APPLICATION NUMBER: 60/262,892 
PRIOR FILING DATE: 2001-01-19 
PRIOR APPLICATION NUMBER: 60/263,605 
PRIOR FILING DATE: 2001-01-23 
PRIOR APPLICATION NUMBER: 60/269,098 
PRIOR FILING DATE: 2001-02-15 
PRIOR APPLICATION NUMBER: 60/264,159 
PRIOR FILING DATE: 2001-01-25 
PRIOR APPLICATION NUMBER: 60/265,517 
PRIOR FILING DATE: 2001-01-31 
PRIOR APPLICATION NUMBER: 60/271,855 
PRIOR FILING DATE: 2001-02-27 

Remaining Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS: 97
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 5 
LENGTH: 2919 



TYPE: DNA 

ORGANISM: Homo sapiens 
US-10-052-648A-5 



Query Match 16.0%; Score 546.2; DB 15; Length 2919; 

Best Local Similarity 55.3%; Pred. No. 8e-166; 

Matches 1231; Conservative 0; Mismatches 903; Indels 93; Gaps 5; 

Qy 62 GGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACT 121

II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 129 GGCTGGCTGGAACTCTCAACCCCAGTGATCCCAATACCTGCAGCTTCTGGGAAAGCTTCA 188 

Qy 122 CAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC- 180

I II I I II I I I I I I I I I I I I I II I I I I I 

Db 189 CTACCACCACCAAGGAGTCCCACTCCCGCCCCTTCAGCCTGCTCCCCTCAGAGCCCTGCG 248

Qy 181 --ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCT 238

I I II I I I I I I I I I II I I I I I I I I I 

Db 249 AGCGGCCCTGGGAGGGCCCCCATACTTGCCCCCAGCCCACGGTTGTATACCGGACCGTGT 308 

Qy 239 ATCGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATG 298 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 309 ACCGTCAGGTGGTGAAGACGGACCACCGCCAGCGCCTGCAGTGCTGCCATGGCTTCTATG 368 

Qy 299 AAAGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTG 358 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 369 AGAGCAGGGGGTTCTGTGTCCCGCTCTGTGCCCAGGAGTGTGTCCATGGCCGTTGTGTGG 428 

Qy 359 CTCCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCG 418 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 429 CACCCAATCAGTGCCAATGTGTGCCAGGCTGGCGGGGCGACGACTGTTCCAGTGAGTGTG 488

Qy 419 ATGGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGT 478

I I II II I I I I I I II I III III M M 

Db 489 CCCCAGGAATGTGGGGGCCACAGTGTGACAAGCCCTGCAGCTGCGGCAACAACAGCTCGT 548

Qy 479 GCAACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGG 538 

I I I I I I I I I I I II II I I I I I III I I I I I I 

Db 549 GTGATCCCAAGAGTGGGGTATGTTCTTGCCCTTCTGGTCTGCAGCCCCCGAACTGCCTTC 608 

Qy 539 ACCGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATG 598

I I I I I I I III I I I I I I I I I I I I I I I II I I I I I I III 

Db 609 AGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGTTCCGCTGCCAGTGCC ATG 665 

Qy 599 GAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCT 658

III I I I I I I I I I I I I I I III I I I I I I III I I II II 

Db 666 GGGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGAGAGAACTGGGCCCA 725 

Qy 659 TCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTT 718

I I I I I I I I I I I I I I I III II I Ml 

Db 726 GCTGTGACGTGTCCTGTTCCCAGGGCACTTCTGGCTTCTTCTGCCCCAGCACCCATTCTT 785 

Qy 719 GTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGA 778

I I I I I I I I II I II I I I I I I I I I I I II I I I I 

Db 786 GCCAAAATGGAGGTGTCTTCCAAACCCCACAGGGCTCCTGCAGCTGCCCCCCTGGCTGGA 845 

Qy 779 T 779 



Db 846 TGGTATGGAGGGTGGGGCCTGTGGGCATGGGGTGTGGGTCTGGGGAGAATTCTGTGGGTG 905

Qy 780 GGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTG 820

I I I I I I III I I I I I I II I I I I I I I

Db 906 GTGCTAAGCAGGGCTCCAAGGGCACCATCTGCTCCCTGCCCTGCCCAGAGGGCTTTCACG 965

Qy 821 GAAAGAACTGTTCCCAAGAATGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAG 880

II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I

Db 966 GACCCAACTGCTCCCAGGAATGTCGCTGCCACAACGGCGGCCTCTGTGACCGATTCACTG 1025

Qy 881 GCCAATGTCATTGCAGTCCAGGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTG 940

I I I I I I Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1026 GGCAGTGCCGCTGCGCTCCGGGTTACACTGGGGATCGGTGCCGGGAGGAGTGCCCGGTGG 1085

Qy 941 GGACCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACC 1000

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1086 GCCGCTTTGGGCAGGACTGTGCTGAGACGTGCGACTGCGCCCCGGACGCCCGTTGCTTCC 1145

Qy 1001 ACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCC 1060

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I MM

Db 1146 CGGCCAACGGCGCATGTCTGTGCGAACACGGCTTCACTGGGGACCGCTGCACGGATCGCC 1205

Qy 1061 TGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAA 1120

I I I I I I I II M I I I I I I I I II I I I I I I M II III

Db 1206 TCTGCCCCGACGGCTTCTACGGTCTCAGCTGCCAGGCCCCCCGCACCTGCGACCGGGAGC 1265

Qy 1121 ACACTCATAGCTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGAC 1180

I I I I I M I I I I I I I II I I I I II I I I II I I I I M II I I II I II I

Db 1266 ACAGCCTCAGCTGCCACCCGATGAACGGGGAGTGCTCCTGCCTGCCGGGCTGGGCGGGCC 1325

Qy 1181 TCTACTGTAATGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCA 1240

II I I I I I I II I I II II I I M I M I II I M II III

Db 1326 TCCACTGCAACGAGAGCTGCCCGCAGGACACGCATGGGCCAGGGTGCCAGGAGCGCTGTC 1385

Qy 1241 GCTGCCAAAATGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGAT 1300

Mill I II I I Ml I I I II II M II I I M I

Db 1386 TCTGCCTGCACGGTGGCGTCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGCCGGGTT 1445

Qy 1301 TCAAAGGAATTGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCT 1360

II II M I II II I I I I M II I I I II I I II II

Db 1446 ACACGGGCCCTCACTGTGCTAGTCTTTGTCCTCCTGACACCTACGGTGTCAACTGTTCTG 1505

Qy 1361 CTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCA 1420

I II I I I I I I M I I I I I II I I II II I I I II I II MM

Db 1506 CACGCTGCTCATGTGAAAATGCCATCGCCTGCTCACCCATCGACGGCGAGTGCGTCTGCA 1565

Qy 1421 AGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTG 1480

I I I I I I I I II I M I Ml

Db 1566 AGGAAGGTTGGCAGCGTGGTAACTGCTCTGTGCCCTGCCCACCCGGAACCTGGGGCTTCA 1625

Qy 1481 GCTGTAACTTAACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCT 1540

I I I I I I I II II I II I I I I I I I I II II I II I II III

Db 1626 GTTGCAATGCCAGCTGCCAGTGTGCCCATGAGGCAGTCTGCAGCCCCCAAACTGGAGCCT 1685

Qy 1541 GCACGTGTGCACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGT 1600

I I I I II II I I I II I I I M I II I I I I I II



Db 1686 GTACCTGCACCCCTGGGTGGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGGGGCAGT 1745 

Qy 1601 ACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCA 1660 

II I I I I I I I I I I II I I Ml 

Db 1746 TTGGAGAAGGTTGTGCCAGTCGCTGTGACTGTGACCACTCTGATGGCTGTGACCCTGTTC 1805 

Qy 1661 CGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTG 1720

III III I II I I I I I I I I I I I I I I I I II II I I I 

Db 1806 ATGGACGCTGTCAGTGCCAGGCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCTGCCCTG 1865 

Qy 1721 AGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCT 1780 

MM II I I I I I M II I I I I I III 

Db 1866 AGGGCTTATGGGGAGTCAACTGTAGCAACACCTGCACCTGCAAGAATGGGGGCACCTGTC 1925

Qy 1781 CCCCTGATGATGGCATCTGCGAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGA 1840 

II I II I II I I II Mill I I I I I I II II Mill II I II I II Mill 

Db 1926 TCCCTGAGAATGGCAACTGCGTGTGTGCGCCCGGATTCCGGGGCCCCTCCTGCCAGAGAT 1985 

Qy 1841 TCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACA 1900

III I I II I I I I I I I I II I I I I I I I I I I I II 

Db 1986 CCTGTCAGCCTGGCCGCTATGGCAAACGCTGTGTGC CCTGCAAGTGCGCTAACC 2039 

Qy 1901 GCAGCGGGCCCTGCCACCACATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCG 1960 

I I I I I I I I II M I II I I M I I M I I I II I II 

Db 2040 ACTCC TTCTGCCACCCCTCGAACGGGGCCTGCTACTGCCTGGCTGGCTGGACAGGCC 2096 

Qy 1961 CCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTA 2020

MUM I MM II I II I I II I I I II I III 

Db 2097 CCGACTGCTCCCAGCCATGCCCTCCAGGACACTGGGGAGAAAACTGTGCCCAGACCTGCC 2156 

Qy 2021 CCTGCACCAACAACGGAACCTGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTT 2080

II II I M II I III M M Ml 

Db 2157 AATGTCACCATGGTGGGACCTGCCATCCCCAGGATGGGAGCTGTATCTGCCCCCTAGGCT 2216 

Qy 2081 GGATTGGCAGTGACTGCTCTCAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCC 2140 

Ml I II II I I I II I M I M I I I 

Db 2217 GGACTGGACACCACTGCTTAGAAGGCTGCCCTCTGGGGACATTTGGTGCTAACTGCTCCC 2276 

Qy 2141 ACACGTGCAACTGCCATAATGGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCA 2200

I I I I I I I I I Mill II I II I II II M 

Db 2277 AGCCATGCCAGTGTGGTCCTGGAGAAAAGTGCCACCCAGAGACTGGGGCCTGTGTATGTC 2336

Qy 2201 CTCCTGG 2207 

Mill 

Db 2337 CCCCAGG 2343



RESULT 12 

US-10-106-698-1976/C 

; Sequence 1976, Application US/10106698 

; Publication No. US20030109690A1 

; GENERAL INFORMATION: 

; APPLICANT: Ruben et al.

; TITLE OF INVENTION: Colon and Colon Cancer Associated Polynucleotides and 
Polypeptides 

; FILE REFERENCE: PA005P1 

; CURRENT APPLICATION NUMBER: US/10/106,698 



; CURRENT FILING DATE: 2002-03-27 

PRIOR APPLICATION NUMBER: PCT/US00/26524 
; PRIOR FILING DATE: 2000-09-28 
; PRIOR APPLICATION NUMBER: US 60/157,137

PRIOR FILING DATE: 1999-09-29 
; PRIOR APPLICATION NUMBER: US 60/163,280 
; PRIOR FILING DATE: 1999-11-03 
; NUMBER OF SEQ ID NOS: 8564 
; SOFTWARE: PatentIn Ver. 3.0
; SEQ ID NO 1976 

LENGTH: 1970 

TYPE: DNA 
; ORGANISM: Homo sapiens 

FEATURE:
; NAME/KEY: misc_feature

LOCATION: (7)..(7)
; OTHER INFORMATION: n equals a,t,g, or c
; NAME/KEY: misc_feature

LOCATION: (1890)..(1890)

OTHER INFORMATION: n equals a,t,g, or c

NAME/KEY: misc_feature

LOCATION: (1970)..(1970)
; OTHER INFORMATION: n equals a,t,g, or c
US-10-106-698-1976 



Query Match 15.8%; Score 540.6; DB 14; Length 1970; 

Best Local Similarity 69.4%; Pred. No. 4.1e-164; 

Matches 735; Conservative 0; Mismatches 324; Indels 0; Gaps 0; 



Qy 917 GGTGCCAGGATGAGTGTCCTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCTGCCAGT 976

1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 IMII II 1 M in II

Db 1498 GGTGCCAAGAGGAGTGCCCCTTCGGGTCCTTCGGCTTCCAGTGCTCACAGCGCTGTGACT 1439

Qy 977 GTGTCAACGGAGGGAAGTGTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTG 1036

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1

Db 1438 GCCACAATGGGGGGCAGTGTTCACCCACCACGGGTGCCTGCGAGTGTGAGCCTGGCTACA 1379

Qy 1037 CTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACA 1096

Ml 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1

Db 1378 AGGGCCCACGCTGCCAGGAGCGACTGTGCCCGGAGGGCCTGCATGGCCCAGGCTGCACCC 1319

Qy 1097 AACGGTGTCCCTGCCACTTGGAAAACACTCATAGCTGTCACCCCATGTCTGGAGAGTGTG 1156

I 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 MIM III

Db 1318 TGCCCTGCCCCTGTGACGCTGACAACACCATCAGCTGCCACCCAGTAACTGGAGCTTGTA 1259

Qy 1157 CCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAATGAGACATGTTCTCCTGGATTCTACG 1216

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M III II 1 II 1 1 1 1 1

Db 1258 CCTGCCAGCCAGGCTGGTCTGGTCACCACTGCAATGAATCCTGCCCTGTTGGCTACTATG 1199

Qy 1217 GGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACTG 1276

1 I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Db 1198 GCGATGGCTGCCAGCTGCCTTGCACCTGTCAGAATGGCGCCGACTGCCACAGCATCACTG 1139

Qy 1277 GAAAGTGCACCTGTGCCCCAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTGG 1336

1 I I I 1 1 1 II 1 1 1 1 1 1 1 1 1 1 III 1 1 1 1 1 1 M 1 1

Db 1138 GGGGCTGCACTTGTGCTCCGGGCTTCATGGGAGAGGTCTGTGCCGTTTCCTGTGCAGCAG 1079

Qy 1337 GAACCTATGGGATAAACTGTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCTC 1396

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I

Db 1078 GGACCTATGGCCCCAACTGCTCGTCCATCTGTAGCTGTAACAATGGTGGCACCTGCTCCC 1019

Qy 1397 CTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGAT 1456

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II

Db 1018 CAGTAGATGGCTCCTGTACCTGCAAGGAAGGGTGGCAGGGCCTGGACTGCACCCTGCCAT 959

Qy 1457 GTCCCAGTGGCACATGGGGCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGAG 1516

I I I I I I I I I I I I I I I I I I I I I I I I I I III II I I I I I I I I I

Db 958 GTCCCAGTGGGACGTGGGGCCTGAACTGCAACGAGAGCTGCACCTGTGCCAATGGGGCAG 899

Qy 1517 CCTGCAACACCCTGGACGGGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGCG 1576

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I III

Db 898 CCTGCAGCCCCATAGACGGCTCCTGCTCCTGCACTCCTGGCTGGCTGGGAGACACCTGTG 839

Qy 1577 AACTTCCCTGCCAGGATGGCACGTACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGCC 1636

I II II I I I I I I I I I I I I I I I II I I I I I I I I III I III I I I I I I I I I I

Db 838 AGCTGCCTTGCCCGGATGGCACATTTGGGCTGAACTGCAGTGAACACTGTGACTGCAGCC 779

Qy 1637 ACGCAGATGGCTGCCACCCTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGTG 1696

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 778 ATGCTGATGGATGTGACCCCGTCACAGGCCACTGCTGCTGCCTGGCCGGATGGACAGGCA 719

Qy 1697 TCCACTGTGACAGCGTGTGTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGCT 1756

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 718 TCCGCTGTGACAGCACGTGTCCACCTGGCCGCTGGGGCCCCAACTGCTCTGTCTCCTGCA 659

Qy 1757 ACTGTAAAAATGGGGCTTCATGCTCCCCTGATGATGGCATCTGCGAGTGTGCACCAGGCT 1816

I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I II I I I I I I I II I I I I

Db 658 GCTGTGAGAATGGAGGCTCCTGCTCCCCAGAGGATGGGAGCTGCGAGTGTGCCCCTGGCT 599

Qy 1817 TCCGAGGCACCACTTGTCAGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGCC 1876

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II

Db 598 TCCGAGGACCCTTATGCCAGAGAATCTGCCCCCCTGGGTTCTATGGCCACGGCTGCGCCC 539

Qy 1877 AGACATGCCCACAGTGCGTTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGTG 1936

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 538 AGCCATGCCCCCTCTGCGTGCACAGCAGCAGGCCCTGCCACCACATCAGCGGCATCTGTG 479

Qy 1937 ACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAATGAAG 1975

I I I I I I I I I I I I I I I I I I I I I I I I I III

Db 478 AGTGCCTCCCAGGATTCTCTGGAGCTCTCTGCAACCAAG 440



RESULT 13 
US-09-796-753-113 

; Sequence 113, Application US/09796753 

; Publication No. US20030027998A1 

; GENERAL INFORMATION: 

; APPLICANT: McCarthy, Sean A. 

; TITLE OF INVENTION: SECRETED PROTEINS AND USES THEREOF 

; FILE REFERENCE: 7853-227-999 

; CURRENT APPLICATION NUMBER: US/09/796,753

; CURRENT FILING DATE: 2001-03-01 

; PRIOR APPLICATION NUMBER: 09/183,175 

; PRIOR FILING DATE: 1998-10-30 



PRIOR APPLICATION NUMBER: 09/223,094 
PRIOR FILING DATE: 1998-12-30 
PRIOR APPLICATION NUMBER: 09/223,546 
PRIOR FILING DATE: 1998-12-30 
PRIOR APPLICATION NUMBER: 09/224,246 
PRIOR FILING DATE: 1998-12-30 
PRIOR APPLICATION NUMBER: 09/259,388 
PRIOR FILING DATE: 1999-02-26 
PRIOR APPLICATION NUMBER: 60/122,458 
PRIOR FILING DATE: 1999-03-01 
PRIOR APPLICATION NUMBER: 09/312,359 
PRIOR FILING DATE: 1999-05-14 
PRIOR APPLICATION NUMBER: 09/336,536 
PRIOR FILING DATE: 1999-06-18 
PRIOR APPLICATION NUMBER: 09/342,687 
PRIOR FILING DATE: 1999-06-29 
PRIOR APPLICATION NUMBER: 09/345,464 
PRIOR FILING DATE: 1999-06-30 
PRIOR APPLICATION NUMBER: 09/365,164 
PRIOR FILING DATE: 1999-07-30 
PRIOR APPLICATION NUMBER: 09/399,723 
PRIOR FILING DATE: 1999-09-20 
PRIOR APPLICATION NUMBER: 09/409,634 
PRIOR FILING DATE: 1999-09-30 
PRIOR APPLICATION NUMBER: 09/471,179 
PRIOR FILING DATE: 1999-12-23 
PRIOR APPLICATION NUMBER: 09/474,071 
PRIOR FILING DATE: 1999-12-29 
PRIOR APPLICATION NUMBER: 09/474,072 
PRIOR FILING DATE: 1999-12-29 
PRIOR APPLICATION NUMBER: 09/514,010 
PRIOR FILING DATE: 2000-02-25 
PRIOR APPLICATION NUMBER: 09/516,745 
PRIOR FILING DATE: 2000-03-01 
PRIOR APPLICATION NUMBER: 09/572,002 
PRIOR FILING DATE: 2000-05-14 
PRIOR APPLICATION NUMBER: 09/597,993 
PRIOR FILING DATE: 2000-06-19 
PRIOR APPLICATION NUMBER: 09/599,596 
PRIOR FILING DATE: 2000-06-22 
PRIOR APPLICATION NUMBER: 09/630,334 
PRIOR FILING DATE: 2000-07-31 
PRIOR APPLICATION NUMBER: 09/606,565 
PRIOR FILING DATE: 2000-06-29 
PRIOR APPLICATION NUMBER: 09/606,317 
PRIOR FILING DATE: 2000-06-29 
PRIOR APPLICATION NUMBER: 09/665,666 
PRIOR FILING DATE: 2000-09-20 
PRIOR APPLICATION NUMBER: 09/677,751 
PRIOR FILING DATE: 2000-09-30 
NUMBER OF SEQ ID NOS: 162 
SEQ ID NO 113 

LENGTH: 5036 

TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE:
NAME/KEY: CDS
LOCATION: (230)...(3379)
US-09-796-753-113 



Query Match 15.4%; Score 528.6; DB 10; Length 5036; 

Best Local Similarity 58.3%; Pred. No. 6.1e-160; 

Matches 987; Conservative 0; Mismatches 694; Indels 12; Gaps 3; 

Qy 515 GCTTCCGGGGCTGGCGCTGCGAGGACCGCTGTGAGCAGGGCACCTATGGTAACGACTGTC 574 

I I I I I I Ml I I I I I I I I I I I 

Db 771 GTTCCAGTGCCCCGAACTGCCTTCAGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCC 830

Qy 575 ATCAGAGATGCCAGTGCCAGAATGGAGCCACCTGCGACCACGTCACGGGGGAATGCCGCT 634 

I I I I I I I I I I I I I I I I I I I I I I I I I I I IN II 

Db 831 AGTTCCGCTGCCAGTGCC ATGGGGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCT 887 

Qy 635 GCCCACCAGGATACACCGGAGCCTTCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTC 694 

I I M III I I I I I I I I I I I I I I I I I I Ml I M 

Db 888 GCCCCGCAGAGAGAACTGGGCCCAGCTGTGACGTGTCCTGTTCCCAGGGCACTTCTGGCT 947 

Qy 695 CACAGTGTGAGCAGAGATGCCCTTGTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAG 754

II I I I I I I I I I I I I I I I I I I I I M I M 

Db 948 TCTTCTGCCCCAGCACCCATCCTTGCCAAAATGGAGGTGTCTTCCAAACCCCACAGGGCT 1007

Qy 755 AATGCTCTTGCCCTTCTGGCTGGATGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTC 814 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II 

Db 1008 CCTGCAGCTGCCCCCCTGGCTGGATGGGCACCATCTGCTCCCTGCCCTGCCCAGAGGGCT 1067 

Qy 815 GCTTTGGAAAGAACTGTTCCCAAGAATGCCAGTGCCATAATGGAGGGACGTGTGATGCTG 874

Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1068 TTCACGGACCCAACTGCTCCCAGGAATGTCGCTGCCACAACGGCGGCCTCTGTGACCGAT 1127 

Qy 875 CCACAGGCCAATGTCATTGCAGTCCAGGATACACAGGGGAACGGTGCCAGGATGAGTGTC 934

I I I I I Ml I II I I I I I I I I I I I I I I I II I I I I I I II I I I 

Db 1128 TCACTGGGCAGTGCCGCTGCGCTCCGGGTTACACTGGGGATCGGTGCCGGGAGGAGTGCC 1187 

Qy 935 CTGTTGGGACCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGT 994 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1188 CGGTGGGCCGCTTTGGGCAGGACTGTGCTGAGACGTGCGACTGCGCCCCGGACGCCCGTT 1247

Qy 995 GTTACCACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAG 1054 

I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1248 GCTTCCCGGCCAACGGCGCATGTCTGTGCGAACACGGCTTCACTGGGGACCGCTGCACGG 1307 

Qy 1055 CACGCCTGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACAAACGGTGTCCCTGCCACT 1114 

M I I I I I I I I I I I I I I I M I III II I I M I I I I I I I 

Db 1308 ATCGCCTCTGCCCCGACGGCTTCTACGGTCTCAGCTGCCAGGCCCCCTGCACCTGCGACC 1367 

Qy 1115 TGGAAAACACTCATAGCTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGT 1174

I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 

Db 1368 GGGAGCACAGCCTCAGCTGCCACCCGATGAACGGGGAGTGCTCCTGCCTGCCGGGCTGGG 1427 

Qy 1175 CAGGACTCTACTGTAATGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGA 1234

I I I I I I I I I I I I I I I I II M I I Ml M I I I I I I M 

Db 1428 CGGGCCTCCACTGCAACGAGAGCTGCCCGCAGGACACGCATGGGCCAGGGTGCCAGGAGC 1487

Qy 1235 TCTGCAGCTGCCAAAATGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCC 1294

III I I I I I I I I I I II I I I IM II I I I I I 



Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 
Db 



1488 ACTGTCTCTGCCTGCACGGTGGCGTCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGC 1547 

1295 CAGGATTCAAAGGAATTGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACT 1354 

I I I I I I II I I II I III I I I I I I I I I I I I I I I I I I I I 

1548 CGGGTTACACGGGCCCTCACTGTGCTAGTCTTTGTCCTCCTGACACCTACGGTGTCAACT 1607 

1355 GTTCCTCTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTA 1414 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

1608 GTTCTGCACGCTGCTCATGTGAAAATGCCATCGCCTGCTCACCCATCGACGGCGAGTGCG 1667 

1415 CTTGCAAGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGG 1474 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 

1668 TCTGCAAGGAAGGTTGGCAGCGTGGTAACTGCTCTGTGCCCTGCCCACCCGGAACCTGGG 1727 

1475 GCTTTGGCTGTAACTTAACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACG 1534 

I I I I I I I I I I MINIM I I I I I I I II I I I I I I I I 

1728 GCTTCAGTTGCAATGCCAGCTGCCAGTGTGCCCATGAGGCAGTCTGCAGCCCCCAAACTG 1787

1535 GGACCTGCACGTGTGCACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATG 1594 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1788 GAGCCTGTACCTGCACCCCTGGGTGGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGG 1847 



1595 



1654 



GCACGTACGGGCT GAACTGT GCT GAGCGCTGCGACT GCAGCCACGCAGATGGCT GC CACC 

I II II I I I I I I I I I I I I I I I I I I I I II I I I I I I III 
1848 GGCAGTTTGGAGAAGGTTGTGCCAGTCGCTGTGACTGTGACCACTCTGATGGCTGTGACC 1907 

1655 CTACCACGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGTGTCCACTGTGACAGCGTGT 1714 

II III III I I I I I M I I I I I I I I I II I II I 

1908 CTGTTCATGGACGCTGTCAGTGCCAGGCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCT 1967 

1715 GTGCTGAGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTT 177 4 

I I I I I I I I I I M I I I I I I I I I I I I I I I I I I II I I I I I 

1968 GCCCTGAGGGCTTATGGGGAGTCAACTGTAGCAACACCTGCACCTGCAAGAATGGGGGCA 2027 

1775 CAT GCT C CC CT GAT GAT GGC AT CT G C GAGT GT G C AC C AG GCT T C C GAGGC AC CAC T T GT C 1834 

III I I I I I I I II I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I 

2 028 CCTGTCTCCCTGAGAATGGCAACTGCGTGTGTGCACCCGGATTCCGGGGCCCCTCCTGCC 2 087 

1835 AGAGGATCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGCCAGACATGCCCACAGTGCG 1894 

I I I I III I I II I I I I I I I II I II I I II I I I I 

2088 AGAGATCCTGTCAGCCTGGCCGCTATGGCAAACGCTGTGTGCC CTGCAAGTGCG 2141 

1895 TTCACAGCAGCGGGCCCTGCCACCACATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCA 1954 

III I I M I I I I I I I I I I I II Ml I I I I I I I I I I I 

2142 CTAACCACTCCTT CTGCCACCCCTCGAACGGGACCTGCTACTGCCTGGCTGGCTGGA 2198 

1955 CAGGCGCCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAA 2014 

I I I I I I I I I I I I I I I I II I II I I II I I I I I I I 

2199 CAGGC C CC GACT G CT C C CAG CCATGC C CT C CAG GACACT GGGGAGAAAACT GT GC C C AGA 2258 

2015 T T T GT AC CT GC AC C AAC AAC GGAAC CT GT AAC C C CAT T GAC AGAT CT T GT C AGT GT T AC C 207 4 

II II II I I I I II I I I I I II I III II II 

CCTGCCAATGTCACCATGGTGGGACCTGCCATCCCCAGGATGGGAGCTGTATCTGCCCCC 2318 



2259 



2075 C C GGTT G GATTGGCAGT GACT GCT CTCAAC CAT GTC CAC CTGCCCACTGG GGC CCAAACT 2134 

I I I I I I II I I I I I I I II I I I I I I I II I I I I I 

2319 TAGGCTGGACTGGACACCACTGCTTAGAAGGCTGCCCTCTGGGGACATTTGGTGCTAACT 2378 



Qy 2135 GCATCCACACGTGCAACTGCCATAATGGAGCTTTCTGCAGCGCCTACGATGGGGAATGTA 2194 

II III I I I I I I I I I I I I I Ml M I I I I I I Ml 

Db 2379 GCTCCCAGCCATGCCAGTGTGGTCCTGGAGAAAAGTGCCACCCAGAGACTGGGGCCTGTG 2438

Qy 2195 AATGCACTCCTGG 2207 

III I I I I I 
Db 2439 TATGTCCCCCAGG 2451 



RESULT 14 

US-09-833-381-1910/C 

Sequence 1910, Application US/09833381 
Patent No. US20020132090A1 
GENERAL INFORMATION: 
APPLICANT: Robison, Keith E. 

TITLE OF INVENTION: Novel Nucleic Acid and Protein Homologs
FILE REFERENCE: 5800-119 

CURRENT APPLICATION NUMBER: US/09/833,381
CURRENT FILING DATE: 2001-04-11 
PRIOR APPLICATION NUMBER: 09/516,448 
PRIOR FILING DATE: 2000-02-29 
NUMBER OF SEQ ID NOS: 2050

SOFTWARE: FastSEQ for Windows Version 3.0 
SEQ ID NO 1910 
LENGTH: 5197 
TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE:

NAME/KEY: misc_feature
LOCATION: (1)...(5197)
OTHER INFORMATION: n = A,T,C or G 
US-09-833-381-1910 



Query Match 15.3%; Score 524.6; DB 9; Length 5197;
Best Local Similarity 58.0%; Pred. No. 1.3e-158;
Matches 972; Conservative 0; Mismatches 694; Indels 11; Gaps 2;



Qy 531 CTGCGAGGACCGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTG 590

I I I I I I II I I I III I I I I I I I I I I I I I I I I I I I I I

Db 4243 CTGCCTTCAGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGTTCCGCTGCCAGTG 4184

Qy 591 CCAGAATGGAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACAC 650

II I I I I I I I I I I I I I I I I I I I I III MINI I II II

Db 4183 CC---ATGGGGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGAGAGAAC 4127

Qy 651 CGGAGCCTTCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAG 710

I I I I I I I I I I I I I I I I I I I III II I

Db 4126 TGGGCCCAGCTGTGACGTGTCCTGTTCCCAGGGCACTTCTGGCTTCTTCTGCCCCAGCAC 4067

Qy 711 ATGCCCTTGTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTC 770

II I I I I I I I I I I I I I I I I I II I II III I I I I I I

Db 4066 CCATCCTTGCCAAAATGGAGGTGTCTTCCAAACCCCACAGGGCTCCTGCAGCTGCCCCCC 4007

Qy 771 TGGCTGGATGGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTGGAAAGAACTG 830

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I Ml Mill

Db 4006 TGGCTGGATGGGCACCATCTGCTCCCTGCCCTGCCCAGAGGGCTTTCACGGACCCAACTG 3947



Qy 831 TTCCCAAGAATGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAGGCCAATGTCA 890

I I I II I I I I I I I I I II I I I I I I I I I I I I I I I

Db 3946 CTCCCAGGAATGTCGCTGCCACAACGGCGGCCTCTGTGACCGATTCACTGGGCAGTGCCG 3887

Qy 891 TTGCAGTCCAGGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTGGGACCTATGG 950

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I

Db 3886 CTGCGCTCCGGGTTACACTGGGGATCGGTGCCGGGAGGAGTGCCCGGTGGGCCGCTTTGG 3827

Qy 951 CGTTCTCTGTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACCACGTGAGCGG 1010

I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I

Db 3826 GCAGGACTGTGCTGAGACGTGCGACTGCGCCCCGGACGCCCGTTGCTTCCCGGCCAACGG 3767

Qy 1011 CGCATGCCTCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCCTGTGTCCTGA 1070

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 3766 CGCATGTCTGTGCGAACACGGCTTCACTGGGGACCGCTGCACGGATCGCCTCTGCCCCGA 3707

Qy 1071 GGGGCTCTACGGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAAACACTCATAG 1130

II I I I I I II I I II I I I I I I I I I M

Db 3706 CGGCTTCTACGGTCTCAGCTGCCAGGCCCCCTGCACCTGCGACCGGGAGCACAGCCTCAG 3647

Qy 1131 CTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGACTCTACTGTAA 1190

III I I II I III I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 3646 CTGCCACCCGATGAACGGGGAGTGCTCCTGCCTGCCGGGTTGGGCGGGCCTCCACTGCAA 3587

Qy 1191 TGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCAGCTGCCAAAA 1250

M I I II II I I I I I II I I I I I I I I III I M I I I

Db 3586 CGAGAGCTGCCCGCAGGACACGCATGGGCCAGGGTGCCAGGAGCACTGTCTCTGCCTGCA 3527

Qy 1251 TGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGATTCAAAGGAAT 1310

III I I II I I I II II I I I I I I I I I I I II

Db 3526 CGGTGGCGTCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGCCGGGTTACACGGGCCC 3467

Qy 1311 TGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCTCTCGCTGTGG 1370

I M I I III I I I I I I I I I I I I I I I I I I I I I I I I I II I I I

Db 3466 TCACTGTGCTAGTCTTTGTCCTCCTGACACCTACGGTGTCAACTGTTCTGCACGCTGCTC 3407

Qy 1371 CTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCAAGGCAGGCTG 1430

I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I

Db 3406 ATGTGAAAATGCCATCGCCTGCTCACCCATCGACGGCGAGTGCGTCTGCAAGGAAGGTTG 3347

Qy 1431 GCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTGGCTGTAACTT 1490

Ml II I II I I I I I I I II I I I I II I II I I I I II I I

Db 3346 GCAGCGTGGTAACTGCTCTGTGCCCTGCCCACCCGGAACCTGGGGCTTCAGTTGCAATGC 3287

Qy 1491 AACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCTGCACGTGTGC 1550

I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I

Db 3286 CAGCTGCCAGTGTGCCCATGAGGCAGTCTGCAGCCCCCAAACTGGAGCCTGTACCTGCAC 3227

Qy 1551 ACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGTACGGGCTGAA 1610

Mill I I I I II I I I II I I I I II II I I I I II I I M

Db 3226 CCCTGGGTGGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGGGGCAGTTTGGAGAAGG 3167

Qy 1611 CTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCACGGGCCATTG 1670

I I II I I I II I I II II II I I I II I II I M I I I I I III II

Db 3166 TTGTGCCAGTCGCTGTGACTGTGACCACTCTGATGGCTGTGACCCTGTTCATGGACGCTG 3107



Qy 1671 CCGCTGCCTCCCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTGAGGGACGCTG 1730 

I I I I I I I I I II I I I I I I I I I II M I I I M I I M 

Db 3106 TCAGTGCCAGGCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCTGCCCTGAGGGCTTATG 3047 

Qy 1731 GGGCCCCAACTGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCTCCCCTGATGA 1790

Ml MINI I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 3046 GGGAGTCAACTGTAGCAACACCTGCACCTGCAAGAATGGGGGCACCTGTCTCCCTGAGAA 2987 

Qy 1791 TGGCATCTGCGAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGATCTGCTCCCC 1850 

I I I I I I I I I I I I I I I I I I I II I I I I I Ml M I M I I I I I I I I II 

Db 2986 TGGCAACTGCGTGTGTGCACCCGGATTCCGGGGCCCCTCCTGCCAGAGATCCTGTCAGCC 2927

Qy 1851 TGGTTTTTATGGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACAGCAGCGGGCC 1910 

I I I I I I I I I I I I I I I I I I 

Db 2926 TGGCCGCTATGGCAAACGCTGTG--------TGCCCTGCAAGTGCGCTAACCACTCCTTC 2875

Qy 1911 CTGCCACCACATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCGCCCTCTGCAA 1970

I || | I I I II I I I I I I I I I I I I I I M I I I I I I I I I 

Db 2874 TGCCACCCCTTCGAACGGGACCTGCTACTGCCTGGCTGGCTGGACAGGCCCCGACTGCTC 2815 

Qy 1971 TGAAGTGTGTCCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTACCTGCACCAA 2030

I I I I I II I I I I M II 

Db 2814 CCAGCCATGCCCTCCAGGACACTGGGGAGAAAACTGTGCCCAGACCTGCCAATGTCACCA 2755

Qy 2031 CAACGGAACCTGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTTGGATTGGCAG 2090

II 11111 I I I I II I Ml M M I I I M I I I I 

Db 2754 TGGTGGGACCTGCCATCCCCAGGATGGGAGCTGTATCTGCCCCCTAGGCTGGACTGGACA 2695 

Qy 2091 TGACTGCTCTCAACCATGTCCACCTGCCCACTGGGGCCCAAACTGCATCCACACGTGCAA 2150

I I I I I I I I MM M M I I M I I I II I 

Db 2694 CCACTGCTTAGAAGGCTGCCCTCTGGGGACATTTGGTGCTAACTGCTCCCAGCCATGCCA 2635 

Qy 2151 CTGCCATAATGGAGCTTTCTGCAGCGCCTACGATGGGGAATGTAAATGCACTCCTGG 2207 

M I II I II M I M I M 

Db 2634 GTGTGGTCCTGGAGAAAAGTGCCACCCAGAGACTGGGGCCTGTGTATGTCCCCCAGG 2578



RESULT 15 
US-10-052-648A-1 

; Sequence 1, Application US/10052648A 
; Publication No. US20040005558A1 
; GENERAL INFORMATION: 
APPLICANT: Spytek, Kimberly A.
APPLICANT: Stone, David J. 
APPLICANT: Vernet, Corine A.M. 
APPLICANT: Zerhusen, Bryan D. 

TITLE OF INVENTION: PROTEINS, POLYNUCLEOTIDES ENCODING THEM AND METHODS OF 
TITLE OF INVENTION: USING THE SAME 
FILE REFERENCE: 21402-250 (CURA-550) 
Qy 62 GGACAGCATCACCTCTGAATCTTGAAGACCCTAATGTGTGTAGCCACTGGGAAAGCTACT 121

II II I I I I I I I I I I I I I I I M I I I I I I I I I I 

Db 47 GGCTGGCTGGAACTCTCAACCCCAGTGATCCCAATACCTGCAGCTTCTGGGAAAGCTTCA 106 

Qy 122 CAGTGACTGTGCAAGAGTCATACCCACATCCCTTTGATCAAATTTACTACACGAGCTGC- 180

I II I I I I I I I I I I I I I I I I I M I IM> 

Db 107 CTACCACCACCAAGGAGTCCCACTCCCGCCCCTTCAGCCTGCTCCCCTCAGAGCCCTGCG 166 

Qy 181 --ACTGACATTCTAAACTGGTTTAAATGCACGCGGCACAGAGTCAGCTATCGGACAGCCT 238

I I I I I I I I I I I I I I I I I I I I II I I 

Db 167 AGCGGCCCTGGGAGGGCCCCCATACTTGCCCCCAGCCCACGGTTGTATACCGGACCGTGT 226 

Qy 239 ATCGACATGGGGAGAAGACTATGTATAGGCGCAAGTCTCAGTGTTGTCCTGGATTTTATG 298

I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I 

Db 227 ACCGTCAGGTGGTGAAGACGGACCACCGCCAGCGCCTGCAGTGCTGCCATGGCTTCTATG 286 



Qy 299 AAAGCGGGGAAATGTGTGTCCCCCACTGTGCTGATAAATGTGTCCATGGTCGCTGTATTG 358 

I I I I I I I I I I I I I I I I I I I I I I I I I II II 1 I I I I I I I I I I I I I 

Db 287 AGAGCAGGGGGTTCTGTGTCCCGCTCTGTGCCCAGGAGTGTGTCCATGGCCGTTGTGTGG 346

Qy 359 CTCCAAACACCTGTCAGTGTGAGCCTGGCTGGGGAGGGACCAACTGCTCCAGTGCCTGCG 418

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I III 

Db 347 CACCCAATCAGTGCCAATGTGTGCCAGGCTGGCGGGGCGACGACTGTTCCAGTGAGTGTG 406

Qy 419 ATGGTGATCACTGGGGTCCCCACTGCACCAGCCGGTGCCAGTGCAAAAATGGGGCTCTGT 478 

I I I I I I I I I I I I I I I I I I III II M 

Db 407 CCCCAGGAATGTGGGGGCCACAGTGTGACAAGCCCTGCAGCTGCGGCAACAACAGCTCGT 466

Qy 479 GCAACCCCATCACCGGGGCTTGCCACTGTGCTGCGGGCTTCCGGGGCTGGCGCTGCGAGG 538

I I I I I I I I I I I II II I I I I I III I I I I I I 

Db 467 GTGATCCCAAGAGTGGGGTATGTTCTTGCCCTTCTGGTCTGCAGCCCCCGAACTGCCTTC 526

Qy 539 ACCGCTGTGAGCAGGGCACCTATGGTAACGACTGTCATCAGAGATGCCAGTGCCAGAATG 598

I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I III 

Db 527 AGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGTTCCGCTGCCAGTGCC---ATG 583

Qy 599 GAGCCACCTGCGACCACGTCACGGGGGAATGCCGCTGCCCACCAGGATACACCGGAGCCT 658 

III I I I I I I I I I I I I I I III I I I I I I Ml I I I I II 

Db 584 GGGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGAGAGAACTGGGCCCA 643

Qy 659 TCTGTGAGGATCTTTGTCCTCCTGGTAAACATGGTCCACAGTGTGAGCAGAGATGCCCTT 718

I I I I I I I I I I II I II I Ml 

Db 644 GCTGTGACGTGTCCTGTTCCCAGGGCACTTCTGGCTTCTTCTGCCCCAGCACCCATTCTT 703 

Qy 719 GTCAAAATGGAGGAGTGTGTCATCACGTCACTGGAGAATGCTCTTGCCCTTCTGGCTGGA 778 

I I I I I I I I I I I I I I I II I II III I I I I I I I I I I I I I I 

Db 704 GCCAAAATGGAGGTGTCTTCCAAACCCCACAGGGCTCCTGCAGCTGCCCCCCTGGCTGGA 763

Qy 779 T 779 

I 

Db 764 TGGTATGGAGGGTGGGGCCTGTGGGCATGGGGTGTGGGTCTGGGGAGAATTCTGTGGGTG 823

Qy 780 GGGCACAGTGTGTGGTCAGCCTTGCCCCGAGGGTCGCTTTG 820

I I I I I I I I I I I I I I ! I I I I I I I I I 

Db 824 GTGCTAAGCAGGGCTCCAAGGGCACCATCTGCTCCCTGCCCTGCCCAGAGGGCTTTCACG 883 

Qy 821 GAAAGAACTGTTCCCAAGAATGCCAGTGCCATAATGGAGGGACGTGTGATGCTGCCACAG 880

II I I I I I I I I I I I I I I I I I I I I I I I I I I I Mill I I I I 

Db 884 GACCCAACTGCTCCCAGGAATGTCGCTGCCACAACGGCGGCCTCTGTGACCGATTCACTG 943

Qy 881 GCCAATGTCATTGCAGTCCAGGATACACAGGGGAACGGTGCCAGGATGAGTGTCCTGTTG 940

I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 944 GGCAGTGCCGCTGCGCTCCGGGTTACACTGGGGATCGGTGCCGGGAGGAGTGCCCGGTGG 1003 

Qy 941 GGACCTATGGCGTTCTCTGTGCTGAGACCTGCCAGTGTGTCAACGGAGGGAAGTGTTACC 1000 

I Mill I II I II I II I II I II I I I I I I I I I I I I 

Db 1004 GCCGCTTTGGGCAGGACTGTGCTGAGACGTGCGACTGCGCCCCGGACGCCCGTTGCTTCC 1063 

Qy 1001 ACGTGAGCGGCGCATGCCTCTGTGAAGCAGGCTTTGCTGGCGAGCGCTGCGAAGCACGCC 1060 

I I I I I I I I I I I I I I I I I I I I I I I M I I II II I II I I I M I 

Db 1064 CGGCCAACGGCGCATGTCTGTGCGAACACGGCTTCACTGGGGACCGCTGCACGGATCGCC 1123 
Qy 1061 TGTGTCCTGAGGGGCTCTACGGCATCAAATGTGACAAACGGTGTCCCTGCCACTTGGAAA 1120

I I I I I I I I I I I I I I I I III II I I II I I I I I I I M I

Db 1124 TCTGCCCCGACGGCTTCTACGGTCTCAGCTGCCAGGCCCCCTGCACCTGCGACCGGGAGC 1183

Qy 1121 ACACTCATAGCTGTCACCCCATGTCTGGAGAGTGTGCCTGCAAGCCGGGCTGGTCAGGAC 1180

III I I I I I I I I I I I I I I I I I M I I I I I I I I I I

Db 1184 ACAGCCTCAGCTGCCACCCGATGAACGGGGAGTGCTCCTGCCTGCCGGGCTGGGCGGGCC 1243

Qy 1181 TCTACTGTAATGAGACATGTTCTCCTGGATTCTACGGGGAAGCTTGCCAGCAGATCTGCA 1240

II I I I I I II II I I I I I M I I I I I I I I Ml

Db 1244 TCCACTGCAACGAGAGCTGCCCGCAGGACACGCATGGGCCAGGGTGCCAGGAGCACTGTC 1303

Qy 1241 GCTGCCAAAATGGGGCAGACTGTGACAGTGTGACTGGAAAGTGCACCTGTGCCCCAGGAT 1300

Ml I I I I I I I II I I I

Db 1304 TCTGCCTGCACGGTGGCGTCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGCCGGGTT 1363

Qy 1301 TCAAAGGAATTGACTGCTCTACCCCATGCCCTCTGGGAACCTATGGGATAAACTGTTCCT 1360

II II I I I I I III I I I I I I I I I I I II I I I I I I I I I I I

Db 1364 ACACGGGCCCTCACTGTGCTAGTCTTTGTCCTCCTGACACCTACGGTGTCAACTGTTCTG 1423

Qy 1361 CTCGCTGTGGCTGTAAAAATGATGCAGTCTGCTCTCCTGTGGACGGGTCTTGTACTTGCA 1420

I I I I I I III I I I I I I I I I I I I I I I I

Db 1424 CACGCTGCTCATGTGAAAATGCCATCGCCTGCTCACCCATCGACGGCGAGTGCGTCTGCA 1483

Qy 1421 AGGCAGGCTGGCACGGGGTGGACTGCTCCATCAGATGTCCCAGTGGCACATGGGGCTTTG 1480

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I

Db 1484 AGGAAGGTTGGCAGCGTGGTAACTGCTCTGTGCCCTGCCCACCCGGAACCTGGGGCTTCA 1543

Qy 1481 GCTGTAACTTAACATGCCAGTGCCTCAACGGGGGAGCCTGCAACACCCTGGACGGGACCT 1540

Mill I I I I I I I II I I I II II Mill I III II Ml

Db 1544 GTTGCAATGCCAGCTGCCAGTGTGCCCATGAGGCAGTCTGCAGCCCCCAAACTGGAGCCT 1603

Qy 1541 GCACGTGTGCACCTGGATGGCGCGGGGAGAAATGCGAACTTCCCTGCCAGGATGGCACGT 1600

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II

Db 1604 GTACCTGCACCCCTGGGTGGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGGGGCAGT 1663

Qy 1601 ACGGGCTGAACTGTGCTGAGCGCTGCGACTGCAGCCACGCAGATGGCTGCCACCCTACCA 1660

II I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I

Db 1664 TTGGAGAAGGTTGTGCCAGTCGCTGTGACTGTGACCACTCTGATGGCTGTGACCCTGTTC 1723

Qy 1661 CGGGCCATTGCCGCTGCCTCCCGGGATGGTCAGGTGTCCACTGTGACAGCGTGTGTGCTG 1720

II I III I I I I I II I I I I I II I I I I I II M Ml

Db 1724 ATGGACGCTGTCAGTGCCAGGCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCTGCCCTG 1783

Qy 1721 AGGGACGCTGGGGCCCCAACTGCTCCCTGCCCTGCTACTGTAAAAATGGGGCTTCATGCT 1780

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I III

Db 1784 AGGGCTTATGGGGAGTCAACTGTAGCAACACCTGCACCTGCAAGAATGGGGGCACCTGTC 1843

Qy 1781 CCCCTGATGATGGCATCTGCGAGTGTGCACCAGGCTTCCGAGGCACCACTTGTCAGAGGA 1840

I I I I I I I I I I I I I I I I I I I I I II I I I II I I II I III II I II I I I I I

Db 1844 TCCCTGAGAATGGCAACTGCGTGTGTGCACCCGGATTCCGGGGCCCCTCCTGCCAGAGAT 1903

Qy 1841 TCTGCTCCCCTGGTTTTTATGGGCATCGCTGCAGCCAGACATGCCCACAGTGCGTTCACA 1900

Ml I I I I I I I I I I I I I I II I I II M I I I I I

Db 1904 CCTGTCAGCCTGGCCGCTATGGCAAACGCTGTGTGC------CCTGCAAGTGCGCTAACC 1957

Qy 1901 GCAGCGGGCCCTGCCACCACATCACCGGCCTGTGTGACTGCTTGCCTGGCTTCACAGGCG 1960



I I 1 1 1 1 1 1 1 1 I I 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 1958 ACTCC---TTCTGCCACCCCTCGAACGGGACCTGCTACTGCCTGGCTGGCTGGACAGGCC 2014

Qy 1961 CCCTCTGCAATGAAGTGTGTCCCAGTGGCAGATTTGGGAAAAACTGTGCAGGAATTTGTA 2020

M I I I I I II II II I I I I I M I II I I I M 

Db 2015 CCGACTGCTCCCAGCGCTGCCCTCTGGGGACATTTGGTGCTAACTGCTCCCAGCCATGCC 2074

Qy 2021 CCTGCACCAACAACGGAACCTGTAACCCCATTGACAGATCTTGTCAGTGTTACCCCGGTT 2080

II I I I I I I I I I II I I I I I I 

Db 2075 AGTGTGGTCCTGGAGAAAAGTGCCACCCAGAGACTGGGGCCTGTGTATGTCCCCCAGGGC 2134



Qy 2081 GGATTGG 2087 

I I I I 

Db 2135 ACAGTGG 2141 